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Abstract 

Optical  interferometry  is  a  technique  for  obtaining  high-resolution  imagery  of  a  distant 
target  by  interfering  light  from  multiple  telescopes.  Image  restoration  from  interferometric 
measurements  poses  a  unique  set  of  challenges.  The  first  challenge  is  that  the  measurement 
set  provides  only  a  sparse-sampling  of  the  object's  Fourier  Transform  and  hence  image 
formation  from  these  measurements  is  an  inherently  ill-posed  inverse  problem.  Secondly, 
atmospheric  turbulence  causes  severe  distortion  of  the  phase  of  the  Fourier  samples.  We 
develop  array  design  conditions  for  unique  Fourier  phase  recovery,  as  well  as  a  compre¬ 
hensive  algorithmic  framework  based  on  the  notion  of  redundant-spaced-calibration  (RSC), 
which  together  achieve  reliable  image  reconstruction  in  spite  of  these  challenges.  Within 
this  framework,  we  see  that  classical  interferometric  observables  such  as  the  bispectrum  and 
closure  phase  can  limit  sensitivity,  and  that  generalized  notions  of  these  observables  can 
improve  both  theoretical  and  empirical  performance.  Our  framework  leverages  techniques 
from  lattice  theory  to  resolve  integer  phase  ambiguities  in  the  interferometric  phase  mea¬ 
surements,  and  from  graph  theory,  to  select  a  reliable  set  of  generalized  observables.  We 
analyze  the  expected  shot-noise-limited  performance  of  our  algorithm  for  both  pairwise  and 
Fizeau  interferometric  architectures  and  corroborate  this  analysis  with  simulation  results. 
We  apply  techniques  from  the  field  of  compressed  sensing  to  perform  image  reconstruction 
from  the  estimates  of  the  object's  Fourier  coefficients.  The  end  result  is  a  comprehensive 
strategy  to  achieve  well-posed  and  easily-predictable  reconstruction  performance  in  optical 
interferometry. 
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Introduction 


Optical  interferometry  is  a  multi-aperture  imaging  technique  which  is  attracting  increasing 
interest  in  the  astronomical  and  remote-sensing  communities.  The  appeal  of  this  technique 
is  primarily  due  to  the  high  resolution  it  affords  relative  to  single-aperture  imaging.  Namely, 
the  diffraction-limited  angular  resolution  of  a  single  aperture  is  proporitional  to  where  A 
is  the  wavelength  of  the  light,  and  D  is  the  diameter  of  the  aperture.  On  the  other  hand,  the 
achievable  angular  resolution  of  an  interferometer  is  instead  proportional  to  g^,  where 
Bniax  is  the  maximum  spatial  separation  of  any  two  telescopes  in  the  array.  Therefore  with 
interferometry  one  can  achieve  the  same  high  resolution  offered  by  an  extremely  large  (and 
often  prohibitively-costly)  telescope  by  interfering  light  from  several  telescopes  of  practical 
size  distributed  over  a  large  area. 

Image  reconstruction  in  interferometry  is  akin  to  the  more  general  problem  of  imaging 
with  sparse  samples  in  the  Fourier  domain,  which  is  encountered  in  many  other  fields  from 
medical  imaging  to  radio  astronomy.  Namely,  the  fundamental  problem  is  to  regularize 
the  ill-posed  reconstruction  of  N^sE/  resolution  elements  in  the  astronomical  scene  from 
m  <C  NresEi  Fourier  samples  derived  from  the  array's  interference  patterns.  Traditionally 
approaches  based  on  the  so-called  CLEAN  algorithm  due  to  Flogbom  (1974),  which  implicitly 
fit  the  measurements  to  an  image  model  consisting  of  a  sparse  collection  of  point-like  sources, 
have  been  used  to  solve  this  problem.  Given  the  success  of  this  relatively-simple  algorithm, 
it  is  hardly  surprising  that  recently-developed  techniques  leveraging  sophisticated  sparse 
image  models  (Wiaux  et  ah,  2009)  (Kurien  et  ah,  2014)  have  shown  promise.  In  fact  such 
algorithms  belong  to  a  burgeoning  family  of  cross-disciplinary  techniques,  which  are  based 
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on  seminal  work  within  the  last  decade  (Donoho,  2006)  (Candes  et  al,  2006),  in  the  nascent 
field  of  compressed  sensing.  These  algorithms  generally  take  advantage  of  the  fact  that  while 
undersampled  Fourier  measurement  sets  present  ill-posedness  in  that  they  do  not  uniquely 
define  an  arbitrary  underlying  image,  they  can  uniquely  specify  a  special  set  of  images  of 
practical  interest:  those  that  are  sparse  in  some  subspace. 

If  Fourier  samples  were  directly  available,  CS  techniques  would  directly  apply  to 
interferometric  measurements.  Flowever,  estimation  of  these  samples  is  itself  a  complicated 
inference  problem.  Interferometric  systems  interfere  signals  from  multiple  apertures  to 
produce  superpositions  known  as  fringes  which  encode  samples  of  the  Fourier  Transform 
of  the  object  under  observation.  We  begin  the  thesis  by  presenting  a  mathematical  model 
for  fringe  observation  as  well  as  sensitivity  limits  for  estimation  of  the  fringe  parameters 
in  our  introductory  Chapter  1.  Noisy  Fourier  samples  of  the  scene  under  observation  can 
then  be  derived  from  these  parameters.  Images  can  be  formed  by  estimating  the  intensity 
in  each  pixel  from  these  estimated  Fourier  samples,  which  are  known  as  complex  visibilities. 
A  fundamental  challenge  in  this  process  is  the  distortion  of  the  Fourier  phase  in  these 
samples  due  to  natural  variation  in  the  effective  path  lengths  to  the  target  observed  by  each 
aperture.  The  most  stressing  source  of  this  variation  is  turbulence  in  the  Earth's  atmosphere. 
This  turbulence  alters  the  effective  path  length  from  each  aperture  to  the  target  (i.e.  the 
so-called  optical  piston)  by  a  non-uniform  and  time-varying  amount,  which  in  turn  causes 
rapid  shifting  of  the  fringes  on  the  focal  plane.  If  uncorrected,  the  resulting  phase  noise  in 
the  Fourier  estimates  then  causes  severe  distortion  in  the  reconstructed  image. 

Techniques  for  elimination  of  this  phase  noise  fall  into  two  categories  according  to 
the  photon  flux  level  (or  equivalently  the  Signal-to-Noise  ratio  of  the  fringes).  At  high 
photon-flux  levels,  it  is  possible  to  directly  decouple  the  contributions  of  the  true  Fourier 
phase  from  the  confounding  phase  due  to  unknown  atmosphere-induced  piston.  While  the 
problem  is  ill-posed  in  general,  this  decoupling  can  be  performed  by  alternating  estimation 
of  the  Fourier  and  image  pixels  while  enforcing  a-priori  constraints  in  the  image  domain, 
such  as  source-sparsity  (Pearson  and  Readhead,  1984).  At  low  flux  levels,  on  the  other  hand. 
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the  individual  fringe  exposures  {or  frames)  are  typically  too  weak  for  such  techniques  to  be 
reliable.  Moreover,  the  random  phase  variation  across  frames  due  to  the  atmosphere  means 
that  these  fringe  measurements  cannot  be  directly  integrated.  Instead  atmosphere-invariant 
derivatives  ^  of  the  fringe  measurements  such  as  the  bispectrum  and  the  power  spectrum 
must  be  integrated  and  the  Fourier  coefficients  inferred  from  these  integrated  observables. 

This  thesis  represents  the  confluence  of  two  emerging  trends  in  interferometry  and  in 
inverse  imaging  problems  taken  collectively:  in  the  former,  there  is  a  need  for  approaches 
for  imaging  complex,  extended  objects  with  a  sparse  measurement  set  and  limited  a-priori 
knowledge,  and  in  the  latter.  Compressed  Sensing  techniques  continue  to  show  great 
promise  in  imaging  a  vast  range  of  natural  images  from  similarly-sparse  measurement 
sets.  While  standard  CS  techniques  tailored  to  linear-inverse  problems  and  do  not  apply  to 
atmosphere-invariant  interferometric  observables,  we  develop  and  validate  an  algorithmic 
interface  in  Chapter  2  permitting  these  techniques  to  be  used  successfully.  We  show 
empirically  that  for  compact  scenes  the  effect  of  the  atmosphere  can  be  eliminated  for 
a  wide  range  of  fluxes.  The  limiting  restriction  of  our  interface  to  compact  scenes  is  a 
direct  consequence  of  the  fundamental  ill-posed  nature  of  Fourier  phase  recovery  from 
atmosphere-perturbed  fringe  measurements.  Namely,  since  the  mapping  from  Fourier  phase 
to  the  set  of  measured  phases  is  rendered  non-injective  ^  by  the  atmosphere  in  general, 
we  must  introduce  constraints  upon  the  object  to  uniquely  determine  the  Fourier  phase. 
Such  regularization  techniques  and  the  object  constraints  they  impose  can  be  obviated  by 
introducing  redundancy  into  the  inter-aperture  spacings  in  the  interferometric  array.  This 
well-posed  framework  for  Fourier  phase  determination  forms  the  basis  of  Chapters  3  and  4. 

The  use  of  redundant  aperture  spacing  to  recover  the  Fourier  phase  in  the  presence  of 
piston  variation  in  a  well-posed  manner  is  known  as  redundant  spacing  calibration  (RSC). 
Of  the  myriad  RSC  techniques  that  have  been  developed  for  both  radio  and  optical  in- 

^By  derivative,  we  mean  an  observable  derived  from  the  fringe  measurements,  and  not  the  mathematical 
derivative. 

^By  non-injective,  we  simply  mean  that  a  given  measurement  set  can  be  produced  by  different  sets  of  Fourier 
phases  so  that  the  former  does  not  uniquely  determine  the  latter. 
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terferometry,  some  operate  on  the  measured  complex  visibilities  (which  we  call  Phasor 
approaches),  and  others  on  the  phase  component  of  these  complex  visibilities  (which  we  call 
Phase  approaches).  Even  with  redundancy,  a  fundamental  ambiguity  exists  in  RSC-based 
phase  recovery  which  is  rooted  in  the  27r-periodicity  of  the  interferometric  phase.  As  a  direct 
result  of  this  ambiguity,  a  unique  solution  for  the  complex  visibilities  given  the  measured 
visibilities  does  not  exist  in  either  Phase  or  Phasor  approaches.  However,  we  will  show  in 
Chapter  3  that  for  certain  patterns,  which  we  denote  wrap-invariant  patterns,  this  funda¬ 
mental  ambiguity  can  be  rendered  benign.  Namely,  for  such  patterns,  phase-unwrapping 
techniques  such  as  those  based  on  the  closest-vector-problem  (CVP)  formulation  due  to 
Larmes  and  Anterrieu  (1999)  can  successfully  recover  the  true  Fourier  phase  (modulo  27r). 
We  show  that  this  property  is  conferred  upon  arrays  whose  interferometric  graph  satisfies  a 
certain  cycle-free  condition.  We  specify  this  condition  and,  for  cases  in  which  this  condition 
is  not  satisfied,  we  provide  a  simple  algorithm  for  identifying  those  graph  cycles  which 
prevent  its  satisfaction.  We  apply  this  algorithm  to  diagnose  and  correct  a  member  of  a 
pattern  family  popular  in  the  literature.  The  end  result  of  this  set  of  analysis  is,  to  the  best 
of  our  knowledge,  the  first  sufficient  condition  on  an  aperture  pattern  for  unique  recovery 
of  the  Fourier  phase  from  interferometric  measurements. 

Having  established  conditions  for  unique  Fourier  phase  recovery,  we  then  turn  our 
attention  in  Chapter  4  to  the  issue  of  well-posed  and  practical  image  formation  with  wrap- 
invariant  RSC  patterns.  In  particular  we  consider  low-flux  scenarios  in  which  atmosphere- 
invariant  observables  must  be  integrated  over  many  frames  for  reliable  phase  estimation. 
In  hopes  of  improving  sensitivity,  we  generalize  the  classical  atmosphere-invariant  phase 
observables  (i.e.  the  bispectrum  and  closure  phase)  to  higher-order  observables,  which  we 
denote  as  the  n-spectrum  and  generalized  closure  phase  respectively.  We  extend  the  imiqueness 
results  in  Chapter  3  to  our  new  observables,  and  develop  a  novel  comprehensive  algorithm 
for  image  reconstruction  from  these  observables.  Here  we  leverage  the  notion  of  minimum 
cycle  basis  in  graph  theory.  A  standard  compressed  sensing  technique  known  as  total-variation 
minimization  is  used  to  perform  the  final  image  reconstructions.  We  present  numerical  and 
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visual  evidence  indicating  that  our  generalized  observables  can  yield  better  estimation 
performance  relative  to  their  classical  bispectrum  counterpart.  Moreover,  we  show  that 
our  algorithm's  performance  approaches  the  Cramer-Rao  Lower  Bound  for  this  estimation 
problem  in  a  realistic,  low-light  imaging  scenario. 

In  summary,  great  progress  has  been  made  recently  in  solving  standard  linear  inverse 
problems  in  imaging  as  part  of  the  development  of  compressed  sensing.  Simultaneously, 
many  algorithmic  techniques  have  been  developed  for  solving  the  specialized  ill-posed 
inverse  problem  in  optical  interferometry,  which  is  greatly  complicated  by  the  atmosphere. 
Characterizing  the  penalty  in  achievable  performance  associated  with  this  increased  com¬ 
plexity  has  remained  an  open  problem.  Hence  in  this  thesis  we  ask  the  question:  "'To  what 
extent  can  we  make  interferometric  imaging  immune  to  the  effects  of  the  atmosphere  in 
both  reliability  and  sensitivity?"'.  We  then  takes  steps  to  push  the  limits  of  this  immunity 
using  a  novel  algorithmic  framework  which  guarantees  unique  phase  recovery  via  the 
notion  of  wrap-invariant  array  design,  and  generalizes  the  notion  of  atmosphere-invariant 
observable  in  order  to  approach  theoretical  sensitivity  limits  at  low  flux  levels.  In  these  two 
pursuits,  our  framework  leverages  theory  and  algorithms  from  lattice  theory  and  graph 
theory,  respectively.  In  the  end  we  provide  strong  evidence  that  even  in  scenarios  thought 
to  be  stressing  from  an  SNR  perspective,  we  can  achieve  near-immunity  to  the  effects  of  the 
atmosphere. 
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Chapter  1 


Fundamentals  of  Optical 
Interferometry 

1.1  Chapter  Overview 

In  this  Section,  we  introduce  the  physics-based  principles  of  optical  interferometry,  thereby 
providing  a  foundation  for  the  main  results  of  this  thesis.  As  the  cornerstone  of  these  princi¬ 
ples,  the  Van-Cittert-Zernike  Theorem  establishes  the  interferometer  as  a  Fourier-imaging 
device,  or  in  other  words,  one  whose  output  can  be  easily-conceptualized  when  examined  in 
the  Fourier  domain.  We  then  provide  a  derivation  of  the  sensitivity  limits  for  the  interfero¬ 
metric  measurement  of  image  parameters  in  the  shot-noise-limited  case.  This  derivation 
is  based  on  the  well-known  Cramer-Rao  Lower  Bound  (CRLB)  in  estimation  theory,  and 
corroborates  previous  results  due  to  Zmuidzinas  (2003).  The  CRLB  will  prove  useful  as  a 
performance  benchmark  for  the  algorithms  developed  in  this  thesis.  We  then  proceed  to 
introduce  the  role  of  atmospheric  turbulence  and  other  sources  of  interferometric  phase 
noise  in  interferometry.  Finally  we  define  a  few  important  parameters  of  an  interferometer: 
resolution  element,  Field-of-View,  and  undersampling  ratio. 
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1.2  Interferometric  Architectures 


The  main  appeal  of  interferometry  over  other  astronomical  imaging  techniques  is  that  it 
affords  the  same  high  resolution  offered  by  an  extremely  large  (and  often  prohibitively- 
costly)  telescope  by  interfering  light  from  several  telescopes  of  practical  size.  Optical 
interferometers  image  a  scene  by  sampling  the  2D  Fourier  Transform  of  the  scene.  Several 
excellent  surveys  provide  the  theoretical  basis  of  interferometry  as  well  as  the  practical 
issues  involved  in  building  and  operating  an  interferometer,  including  those  by  Labeyrie 
et  al.  (2006),  Glindemann  (2011),  and  Buscher  (2015).  Each  pair  of  telescopes  measures  a 
single  angular  spatial  frequency  of  radians,  where  b  is  the  vector  difference  of  the 
telescope  positions,  which  is  known  as  a  baseline.  For  an  array  of  Nap  apertures,  the  data  set 
then  consists  of  all  (^'’)  such  measurements. 

There  are  two  popular  beam  combination  architectures  in  use  in  optical  interferometry: 
the  pairwise  combination  scheme,  and  the  Fizeau  combination  scheme.  The  two  schemes 
are  illustrated  in  Figure  1.1.  Suppose  we  have  a  telescope  array  consisting  of  Nap  apertures, 
each  of  which  collects  n  photons  from  a  distant  source.  In  the  Fizeau  scheme,  light  from 
all  telescopes  is  interfered  on  a  single  focal  plane,  forming  an  interference  pattern  known 
as  a  fringe  for  each  telescope  pair.  Each  fringe  encodes  a  sample  of  the  scene's  2D  Eourier 
Transform.  In  the  pairwise  scheme,  light  from  each  aperture  is  split  Nap  ways  and  combined 
with  that  of  each  other  aperture  to  form  a  single  fringe  pattern  on  separate  focal  planes. 
Hence  each  of  the  focal  planes  receives  a  small  fraction  of  the  total  signal  photons. 

Ideally  the  interferometer  provides  a  perfect  encoding  of  the  scene's  sampled  Eourier 
Transform.  In  practice,  however,  we  never  observe  pristine  interference  fringes  in  either 
architecture.  The  quality  of  the  fringes  is  degraded  by  statistical  fluctuations  in  the  arrival 
times  of  photons  on  the  focal  plane,  which  is  a  phenomenon  known  as  shot  noise.  We  will 
quantify  the  impact  of  shot  noise  further  in  the  course  of  the  thesis.  For  now,  we  note  that 
the  Fizeau  architecture  uses  all  light  to  form  each  fringe  while  incurring  shot  noise  from  all 
Napn  photons,  whereas  the  pairwise  architecture  uses  a  fraction  of  the  light  to  form  each 
fringe  and  incurs  shot  noise  due  to  two  apertures.  A  fundamental  result  due  to  Zmuidzinas 
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Figure  1.1:  The  two  popular  beam  combination  schemes  in  optical  interferometry 
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(2003)  quantified  this  tradeoff  and  established  the  superior  overall  sensitivity  of  the  Fizeau 
(or  all-in-one)  scheme  with  respect  to  the  pairwise  scheme.  Because  the  Fizeau  architecture  is 
also  typically  simpler  to  implement,  it  is  often  preferred  in  practice.  The  results  in  this  thesis 
are  therefore  geared  toward  the  Fizeau  architecture,  although  we  will  leverage  pairwise 
architecture  in  Chapter  4  as  a  conceptual  springboard  for  subsequent  analysis. 

1.3  The  Van-Cittert-Zernike  Theorem 

To  derive  the  Theorem,  let  us  model  a  source  in  the  sky  as  a  superposition  of  differential 
emitting  elements  of  size  AQ.  Consider  then  the  electric  field  at  the  j-th  aperture  of  an 
interferometric  array  which  is  a  distance  d  from  one  such  differential  patch.  This  can  be 
written  as: 


Ej{Cl)  =  (1.1) 

where  Rj  is  the  distance  from  the  object  to  the  ;-th  aperture,  co  is  the  angular  frequency 
of  the  wave,  and  c  is  the  speed  of  light.  For  the  purposes  of  this  analysis,  we  can  assume 
that  the  distance  between  the  source  and  the  array  is  much  larger  than  the  inter-aperture 
distances,  i.e.  Rj  =  d,  where  d  is  the  same  for  all  apertures. 
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Figure  1.2:  The  Fizeau  Interferometer  Concept 

The  fields  from  all  of  the  point  sources  in  the  interferometer's  field-of-view  are  super¬ 
imposed  on  the  interferometer's  focal  plane.  For  the  so-called  Fizeau  interferometer,  this 
superposition  is  performed  by  a  beam-combiner,  which  directs  the  light  collected  by  each 
aperture  onto  the  focal  plane  with  an  aperture-specific  wave-vector  rj  as  shown  in  the  Figure 
below. 

We  can  now  compute  the  field  contributed  to  an  arbitrary  point  on  the  focal  plane  p  by 
each  aperture,  which  will  be  a  function  of  the  respective  wave-vectors  and  vector  distances 
from  p.  Namely 


^  fl(njAn  _  g/(ky.xy(p))gia;(t-^) 

where  ky  is  the  wavevector  of  the  outgoing  wave  from  aperture  j  at  the  beam-combiner, 
and  Xj{p)  is  the  vector  position  to  p  relative  the  j-th  aperture  position  at  the  beam-combiner. 

As  we  show  in  Appendix,  we  can  rewrite  this  expression  in  terms  of  the  lateral  aperture 
position  Tj  at  the  beam  combiner  as: 
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^  fl(n)An  _ 

Cl 

where  (pj  is  an  aperture-dependent  phase  which  is  independent  of  p. 

Now  consider  the  field  superposition  at  the  pixel  at  vector  coordinate  p  on  the  focal 
plane  (see  the  picture  below),  which  can  be  written  as: 


E(Q)  ^  a(nm  _ 

where  we  have  neglected  the  attenuation  incurred  by  the  field  en-route  to  the  focal  plane  for 
simplicity.  To  obtain  the  field  for  the  complete  source,  we  now  integrate  over  all  differential 
patches  (which  we  call  point  sources)  in  the  source  to  obtain: 


Etotal  = 


The  detector  at  pixel  p  measures  the  intensity  Q  of  the  superposition.  Namely, 


Q(P)  =  \Etotair  =  EtotalEl 


'“P  Rj 


Q{p)  = 

;=i 


dQ2  ■  fl*(Q2)(X] 


The  final  measurement  is  a  photon  count  which  is  proportional  to  the  time-averaged 
intensity.  Hence,  after  re-arranging  and  time-averaging,  we  obtain: 


“P  .  Ri 


■EJ  ;  =  1  k=l 

We  can  simplify  this  expression  greatly  by  noting  that,  with  a  few  exceptions,  astronomi¬ 
cal  sources  are  spatially-incoherent,  which  means  that  the  complex  amplitudes  of  radiated 


10 


Distant  Emitter 

□ 


^Rjk  =  hkj  ■  n 
hkj 

Figure  1.3:  Path  difference  between  two  apertures 

waves  from  different  points  are  uncorrelated.  Namely, 

(fl(Qi)fl*(Q2))  =  f(Q)^i(Qi -Q2)  (1.9) 

The  result  is  that  the  double  integral  above  collapses  to  a  single  integral  in  Q.  We  can  now 
write  the  sum  of  all  pairwise  products  in  Equation  (1.8)  as  the  following  single  summation 
over  all  aperture  pairs: 

(Q)  =  ^  I  i{n)Napdn+ 

1  Np  Nap 

4  +  /  l(Q)e-^(^)dQ)  (1.10) 

where  the  first  term  represents  the  sum  of  the  products  of  each  aperture  with  itself  (i.e. 
terms  of  the  form  EjE*). 

To  understand  the  meaning  of  Equation  (1.10),  it  is  useful  to  approximate  ARj,  using  the 
geometry  illustrated  in  Eigure  1.2.  Let  us  define  the  unit  vector  originating  at  aperture  1 
and  pointing  in  the  direction  of  the  emitting  source  as  n.  Eor  an  arbitrary  source  location, 
this  unit  vector  would  of  course  differ  for  each  aperture.  However  in  most  astronomical 
scenarios,  the  inter-aperture  distance  is  extremely  small  compared  with  the  distance  between 
the  source  and  the  aperture  array  (Buscher,  2015).  In  such  cases  it  is  reasonable  to  assume 
that  the  direction  vector  n  is  approximately  constant  across  the  array.  As  shown  in  Eigure 
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1.3,  we  have: 


ARjk  =  hkj  ■  n  (1.11) 

where  is  the  vector  difference  position  of  the  k-th  and  j-th  apertures. 

Moreover,  we  can  write  the  differential  patch  dCl  in  terms  of  components  of  this  vector 
n  as: 

dQ  =  {d^)dnxdny  (1.12) 

Substituting  Equations  (1.11)  and  (1.12)  into  Equation  (1.10)  and  noting  that  ^  we 

obtain: 


(Q)  —  J  I{^)Napdnxdny-\- 

^“P  p  2n\j  p  2n\: 

+  I{Cl)e-^^^-^Unxdny)  (1.13) 

k=l  j=l 

Note  that  the  two  integrals  are  f*(^)  and  F(^),  respectively,  where  F('^)  is  the  2D 

b 

Eourier  Transform  of  the  source  evaluated  at  spatial  frequency  Eor  simplicity  of  notation, 
let  us  index  all  of  the  aperture  pairs  {k,])  with  a  single  index  h.  Defining  (pp  :=  (ppj,  and 
F/j  :  =  F(  X )  substituting,  we  have: 


(Q(p))  =  f  I{Cl)Napdnxdny+  ^  ^  p*e-Kp-^r,+^H)  (piq) 

h=l 

(Nflp) 

(Q(p))  =  ^oNap  +  X]  2|F;,|cos(p  ■  Arp  +  ZFp  +  (pp)  (1.15) 

h=l 

From  this  last  equation,  we  see  that  focal  plane  consists  of  a  superposition  of  2D  sinusoids 
{or  fringes),  each  of  which  encodes  a  distinct  Fourier  component  of  the  source.  This  result  is 
the  Van-Cittert-Zernike  Theorem  which  was  first  derived  by  Pieter  van  Cittert  in  1934  (van 
Cittert,  1934)  and  then  proved  in  simpler  fashion  by  Frits  Zernike  in  1938  (Zernike,  1938). 
For  subsequent  analysis  in  this  thesis,  it  will  be  convenient  to  parameterize  the  fringes 
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using  the  following  Definitions: 


Definition  1.3.1.  The  complex  fringe  phasor  z  at  a  given  spatial  frequency  is  given  by:  z  := 

Note  that  the  magnitude  of  the  fringe  phasor  will  be  proportional  to  the  brightness  of 
the  object.  It  is  also  useful  to  have  a  quantity  describing  the  strength  of  a  given  Fourier 
component  relative  to  the  overall  brightness  of  the  object.  This  relative  measure  is  provided 
by  the  complex  visibility: 

Definition  1.3.2.  The  complex  visibility  v  of  an  object  at  a  given  spatial  frequency  is  given  by 

the  ratio  v  =  — 

to 

Definition  1.3.3.  The  visibility  7  of  an  object  at  a  given  spatial  frequency  is  the  modulus  of 

If  I 

the  complex  visibility,  i.e.  7  :=  ^ 

Definition  1.3.4.  The  Fourier  phase  9  of  an  object  at  a  given  spatial  frequency  is  the  argument 
(i.e.  phase)  of  the  complex  visibility,  i.e.  9  :=  ZFj,. 

With  Definition  1.3.3,  we  can  write  Equation  (1.16)  as: 

(Nap) 

(Q(p))  =  FoNap  +  X]  2jhFoCOs{p  ■  Arp  +  ZFp  +  (pp)  (1.16) 

h^l 

The  actual  photon  counts  y  observed  at  the  detectors  are  directly  proportional  to  the 
intensity  present.  Let  the  proportionality  constant  relating  intensity  to  photon  counts  be 
given  by  k,  so  that  we  have: 

(Nap) 

(y(p))  =  KFoNap  +  K  ^  2jhFoCOs{p  ■  Arp  +  ZFp  +  +(ph)  (1.17) 

h=l 

Moreover  we  assume  that  light  is  conserved  throughout  the  fringe  generation  process, 
and  hence  the  photon  counts  must  sum  to  the  total  number  T  of  photons  incident  upon  the 
array,  i.e.  T  =  nNap,  i.e. 
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^  =  E(y(p))  =  E 

p  p 


/  (~f)  \ 

KFoNap  +  K  2jhFoCOs{p  ■  Arp  +  AFp  +  +(ph) 

\  / 


(1.18) 


In  this  thesis,  we  will  consider  the  scenario  in  which  the  periods  of  the  fringe  sinusoids 
are  all  integer  multiples  of  pixels  on  the  focal  plane.  This  is  one  of  the  so-called  DFT 
conditions  commonly  employed  in  interferometry  to  avoid  fringe  estimation  bias  due  to 
spectral  leakage  (Gordon,  J.  A.  and  Buscher,  D.  R,  2012).  In  this  case  the  sinusoidal  term 
vanishes  in  the  sum  in  Equation  (1.18)  leaving  only  the  constant  (first)  term,  and  we  have: 


=  Nj  xFoNap  =  nN, 


^ap 


(1.19) 


which  in  turn  implies  that  k  =  and  hence  by  substitution  we  obtain: 


fp 


(y(p))  = 


nN, 


ap 


(Nap) 


+ 


2n 


F^fp  h^l 


Y,  ^hC0s{p  ■  Arp  +  Oh +  (ph) 


(1.20) 


1.4  The  Cramer-Rao  Bound  for  Fourier  Coefficients  of  Image  Source 

There  are  two  principal  sources  of  noise  that  affect  this  measurement.  We  have  already 
alluded  to  the  shot  noise  arising  from  the  fact  that  photon  arrivals  are  not  uniform  in  time 
even  if  incident  intensity  is  constant.  The  resulting  uncertainty  in  photon  counts  for  a 
given  exposure  time  is  well-modeled  as  a  Poisson  distribution  with  mean  AT,  where  A  is 
proportional  to  the  intensity  and  T  is  the  exposure  time.  The  second  source  of  noise  is  read 
noise,  which  is  thermal  noise  in  the  detection  circuitry.  Read  noise  is  well-modeled  by  the 
Gaussian  distribution.  Since  shot  noise  is  proportional  to  intensity  whereas  read  noise  is 
not,  the  former  dominates  the  latter  in  high  light-level  scenarios.  And  with  recent  advances 
in  electronics,  the  light-level  regime  in  which  read  noise  is  important  continues  to  diminish. 
We  will  therefore  focus  on  detection  limits  for  the  shot-noise-limited  case. 

Recall  Equation  (1.20)  provides  an  expression  for  the  time-averaged  intensity.  To  obtain 
lower  bounds  on  sensitivity,  let  us  assume  that  the  phase  differences  AOj,  are  known  and 
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therefore  calibrated.  We  can  write  the  observation  model  in  matrix  form  as: 


(y) 


In 


cos(p(l)  ■  Aki) 
cos(|D(2)  ■  Aki) 


—sin{p{\)  ■  Aki) 
—sin{p[l)  ■  Aki) 


cos(p(l)  •  Ak2) 
cos(p(2)  ■  Ak2) 


—sin(p(l)  ■  Akz) 
—sin(p(2)  ■  Akz) 


(1.21) 

Suppose  that  the  interferometer's  output  is  a  beam  with  an  extent  of  Nfp  pixels  on  the 
focal  plane.  Letting  A  be  the  Ny^p-by-(2(^'’)  +  1)  matrix  in  the  equation  above  (the  so-called 
visibility-to-pixel  matrix  (V2PM)),  we  can  write  this  matrix  equation  compactly  as: 


Re[vi] 
Im  [z;i] 
Re[v2] 
Im  [^2] 


y  = 


nNgp^j 


2n 


Av 


(1.22) 


where  denote  the  constant  ones  vector  of  length  Nfp,  v  are  the  complex  visibilities. 
The  actual  pixel  counts  can  be  modeled  as  i.i.d.  Poisson  random  variables  with  means  y,  i.e. 


z~Poisson(y).  (1.23) 

Following  a  similar  derivation  given  in  Harmany  et  al.  (2012),  we  can  now  write: 


N/p  /  T  nZ, 


p(z|y)  exp(-ejy) 


(1.24) 


where  e,  is  the  unit  vector  of  the  z-th  canonical  basis.  Substituting  for  y,  we  obtain  the 
negative  log-likelihood  as: 


N 


7p 


f  (/)  =  iLy  -  E  Nlogiefy)  +  C 


(1.25) 


where  1  is  an  all-ones  vector  of  size  Nfp,  and  C  is  a  constant  independent  of  v.  Let 
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A  :=  ^A.  Then  the  gradient  with  respect  to  v  is  given  by: 


%  z- 

VF(/)  =  A^l  -  (1.26) 

6;  V 


Therefore  the  Hessian  is: 


V2f(/)  =  A^  f  (1-27) 

/=i  (e/y) 

To  obtain  the  Fisher  information  matrix  1(f),  we  take  the  negative-expectation  of  this 
quantity: 

1(f)  =  -E[V^F{f)]  =  -A^  f  -^^eiejA  (1.28) 

,=i  (e/y) 

But  since  E[z,]  =  ejy,  this  simplifies  to: 


1(f)  =  A^DA 


(1.29) 


where  D  is  a  diagonal  matrix  with  D,-,-  =  jjyj- 

Therefore  the  Cramer  Rao  Lower  Bound  (CRLB)  for  the  variance  of  each  estimated 
Fourier  coefficient  is  given  by: 


varifii)  >  [1(f)  %  (1.30) 

This  expression  matches  the  result  given  by  Zmuidzinas  (2003). 

1.5  Key  Parameters  of  an  Interferometer:  Resolution,  Field-of-View, 
and  Undersampling  Ratio 

In  the  Section  1.3  we  established  that  the  interference  fringe  generated  by  a  pair  of  apertures 
separated  by  a  vector  b  encodes  the  amplitude  and  phase  of  the  Fourier  Transform  of 
the  scene  at  spatial  frequency  ^ .  For  real  images  each  baseline  actually  contributes  two 
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Figure  1.4:  The  relationship  between  aperture  patterns  and  Fourier  sampling 
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spatial  frequency  samples,  as  the  Fourier  Transform  of  real  images  is  conjugate  symmetric. 
As  an  example,  consider  the  aperture  pattern  in  the  left  panel  of  Figure  1.4.  Suppose  we 
are  interfering  light  at  a  wavelength  of  500  nm.  The  baseline  hu  formed  by  apertures 
1  and  2  generates  a  fringe  encoding  the  complex  visibility  vi2  at  a  spatial  frequency  of 
^  =  (2e6,6e6)  cycles /radian,  and  its  conjugate  vj‘2  at  a  spatial  frequency  —  (2e6,6e6) 
cycles/ radian. 

Consider  the  Cartesian  frequency  sampling  grid  with  gridpoint  spacing  Amin  as  shown 
in  Figure.  Suppose  we  sample  a  2D-signal  in  the  Fourier  domain  with  sampling  interval 
^sjreq  =  2k  along  each  dimension.  By  the  well-known  Nyquist  Sampling  Theorem,  we  know 
that  only  images  spatially-limited  to  ±R  along  each  dimension  can  be  uniquely  represented 
with  these  samples;  images  beyond  this  extent  will  suffer  from  aliasing.  Therefore  an 
interferometer  whose  sampled  spatial  frequencies  lie  on  a  grid  with  spacing  Amin  = 
has  an  unambiguous  Field-of-View  area  of  =  ^4— .  Conversely,  by  the  space-frequency 

^min  ^min 

duality  of  the  Nyquist  Theorem,  we  know  that  we  can  uniquely  represent  an  image  ban- 
dlimited  to  frequency  extent  ±L  by  sampling  in  space  at  an  interval  of  As, space  =  21  ■  Given 
the  maximum  spatial  frequency  observed  by  our  interferometer  is  L  =  Amax  =  ^^2  the 
bandlimited  approximation  of  the  image  is  uniquely  defined  by  samples  spaced  apart 
along  each  dimension.  Therefore  the  size  of  the  smallest  resolvable  element  in  the  image. 
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Figure  1.5:  Representation  of  a  scene  as  a  Field-of-View  comprised  of  resolution  elements 


FOVy 


A 


which  we  call  the  resolution  element,  is  given  by  Figure  1.5  depicts  the  notions  of 

Field-of-View  (FOV)  and  resolution  element  (resEl)  visually. 

With  the  areas  of  the  Field-of-View  and  resolution  element  in  hand,  we  can  now  compute 
the  number  of  resolution  elements  in  the  bandlimited  approximation  of  the  image  as  the 
ratio  of  these  two  quantities,  or: 


Nre^Us  =  (1.31) 

where  r  = 

^min 

We  now  define  our  undersamplmg  p  as  the  ratio  of  the  number  of  distinct  Fourier 
samples  to  the  number  of  resolution  elements  f^resEis  hi  the  image.  Each  Fourier  sample 
we  measure  is  actually  a  pair  of  samples  since  for  real  images,  the  value  of  the  Fourier 
Transform  at  a  given  spatial  frequency  is  the  conjugate  of  that  at  the  corresponding  negative 
spatial  frequency.  Hence  we  have: 


,  _  _  1  (Nap\ 

^  NresEls  2r^  \  2  J 


(1.32) 
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1.6  Atmospheric  and  Instrumental  Phase  Noise 


So  far  our  analysis  has  assumed  that  the  effective  path  between  the  beam-combiner  and 
the  target  depends  merely  on  the  source-instrument  geometry  depicted  in  Figure  1.2.  In 
practice  there  are  several  deterministic  and  random  sources  of  path  variation  across  the 
aperture  array.  Common  deterministic  sources  of  delay  include  the  optical  path  in  the 
instrument  itself.  The  predominant  random  sources  of  delay  are  atmospheric  turbulence 
and  instrumental  vibration.  Atmospheric  turbulence,  which  is  typically  the  more  stressing 
of  these  two  sources,  alters  the  optical  path  traversed  by  the  wave  arriving  at  each  aperture 
in  a  nonuniform  and  time-varying  manner. 

Let  us  consider  the  total  path  alteration  due  to  all  of  the  sources  mentioned.  Let  us  then 
define  the  corresponding  phase  shift  resulting  from  this  alteration  at  aperture  j  as  .  If 

these  phase  shifts  are  carried  through  the  analysis  in  the  previous  section,  we  obtain: 

(Nap. 

nN 

{yip))  =  £  lkjCOs{p  ■  Arkj  +  9kj  +  (pkj)  (1.33) 

‘^fp  (Kj) 

where  ^kj  '■  =  (pk, total  ~  ^j, total  is  the  difference  between  the  total  phase  shifts  at  the  two 
apertures  in  the  baseline  associated  with  aperture  pair  (/,/). 

Let  us  now  examine  a  traditional  method  for  eliminating  the  effect  of  this  phase  noise 
which  was  first  suggested  by  Jermison  (1958)  in  the  context  of  inteferometry  at  radio  wave¬ 
lengths.  Suppose  the  atmosphere  adds  phases  (pj  and  to  apertures  k  and  j,  respectively. 
The  result  is  that  the  observed  Fourier  phasor  is  given  by: 

yjk  =  (1.34) 

We  can  eliminate  such  nuisance  factors  by  forming  a  triple  product  g  of  the  Fourier 
phasors  along  a  triangle  of  baselines  (e.g.  gi23  :=  V12V23V31).  As  shown  in  Figure  1.6, 
this  special  triple  product  (known  as  the  bispectrum)  cancels  the  atmospheric  phase  terms, 
leaving  only  the  desired  Fourier  information  (i.e.  the  {0}).  The  phase  of  the  bispectrum  Zg, 
which  is  known  as  the  closure  phase,  can  hence  be  used  to  recover  estimates  for  the  Fourier 
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Figure  1.6:  Eliminating  the  effect  of  atmospheric  turbulence  with  phase  closure 


The  Atmosphere  Problem 


^jk  =  yjkC 


Fourier  triple  products  (bispectra)  cancel  atmosphere 

8123  =  V12V23V3I 


Zvi2  —  O12  —fpl 


ZV23  —  O23  + 


+  ^V3i  =  f^3i  —ffT) 

^§123  “  ^12  +  ^23  +  ^^31 


phases.  Consider  an  interferometer  that  measures  all  baselines  among  Nap  apertures.  Out 
of  the  (^'’)  possible  triangles,  only  of  the  associated  closure  phase  relations  are 

independent  (Readhead  et  ah,  1988).  If  we  combine  these  closure  phases  with  the  Fourier 
magnitude  estimates  for  each  of  the  i^ff)  baselines,  we  now  have  a  set  of  atmospheric- 
invariant  observables  with  which  we  can  attempt  image  reconstruction. 
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Chapter  2 


Leveraging  compressed  sensing 
techniques  in  optical  interferometry 

2.1  Chapter  Overview 

Optical  interferometers  image  a  scene  by  sampling  the  spatial  frequencies  that  comprise 
the  2D  Fourier  Transform  of  the  scene.  The  sample  set  is  typically  much  smaller  than  the 
number  of  resolvable  elements  in  the  instrument's  field-of-view  (FOV).  In  the  related  field  of 
radio  interferometry,  compressed  sensing  (CS)  techniques  based  on  recent  seminal  work  (see, 
e.g.,  Wiaux  et  al.  (2009))  have  proven  successful  in  regularizing  the  ill-posed  problem  arising 
from  this  undersampling.  In  contrast,  atmospheric  turbulence  precludes  the  availability  of 
direct  Fourier  phase  information  in  optical  interferometry,  and  hence  CS  techniques  do  not 
directly  apply.  We  have  developed  and  validated  a  robust  algorithmic  interface  between  the 
Fourier  magnitude  and  bispectrum  observables  available  in  the  optical  interferometry  and 
CS  regularizations  known  as  Total  Variation  Minimization  for  which  fast  solvers  exist. 

2.2  Problem  Statement  and  Approach 

There  are  two  principal  challenges  that  one  encounters  when  trying  to  reconstruct  im¬ 
ages  from  optical  interferometric  measurements.  Each  aperture  pair  (or  baseline)  in  an 
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interferometer  samples  the  Fourier  transform  of  the  image  in  its  field  of  view  at  a  spatial 
frequency  j,  where  b  is  the  length  of  the  baseline,  and  A  is  the  wavelength  of  light  collected. 
Interferometers  rarely  measure  enough  baselines  to  fully  sample  the  Fourier  transform. 
Flence  reconstruction  of  the  image  from  such  undersampled  measurement  set  is  an  ill-posed 
(or  underdetermined)  problem. 

In  this  way  image  reconstruction  in  interferometry  is  akin  to  the  general  problem  of 
inference  from  sparse  Fourier  samples  which  is  encountered  in  many  other  fields  including 
medical  imaging  and  radio  astronomy.  In  the  case  of  interferometry,  we  must  recover  inten¬ 
sities  for  NresEi  resolution  elements  in  the  scene  under  observation  from  m  <C  ^resEi  spatial 
frequency  measurements  corresponding  to  the  available  baseline  pairs  in  the  telescope  array. 
To  make  the  problem  well-posed,  additional  constraints  must  be  enforced.  Traditionally 
regularization  approaches  based  on  the  so-called  CLEAN  algorithm  due  to  Flogbom  (1974), 
which  implicitly  fits  the  measurements  to  an  image  model  consisting  of  a  sparse  collection 
of  point-like  sources,  have  been  used  to  solve  this  problem. 

It  is  well-known  in  imaging  that  successful  regularization  schemes  apply  constraints 
which  accurately  capture  the  properties  of  the  image  we  wish  to  reconstruct.  Likewise 
the  success  has  been  largely  limited  to  astronomical  scenes  which  closely  match  its  point- 
source-collection  assumption.  The  astronomical  community  has  hence  been  led  to  consider 
more  generally-applicable  prior  models.  Restricting  the  size  of  an  image's  Ll-norm  as 
applied  to  either  directly  to  the  image  pixels,  their  gradient,  or  their  wavelet  coefficients, 
has  proven  remarkably  effective  in  capturing  the  inherent  compressibility  in  natural  images. 
This  observation  and  techniques  that  exploit  it  are  at  the  core  of  the  new  field  of  compressed 
sensing  (CS)  (Candes  et  al,  2006)  (Donoho,  2006).  Though  a  wide  variety  of  CS  techniques 
have  arisen  in  the  past  decade,  the  vast  majority  of  them  regularize  problems  with  a  linear 
observation  model: 


y  =  Fx  -|-  n 


(2.1) 


1 


The  Ll-norm  of  a  vectorized  image  x,  denoted  ||x||  j,  is  the  sum  of  the  absolute  values  of  the  entries  in  x 
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where  y  is  the  measurement  vector,  x  is  a  vector  representing  the  unknown  signal  or 
image,  F  is  the  measurement  matrix,  and  n  represents  additive  measurement  noise. 

While  CS  techniques  have  been  applied  successfully  to  reconstruction  in  radio  inter¬ 
ferometry  (Wiaux  et  al.,  2009),  optical  interferometry,  on  the  other  hand,  poses  additional 
challenges  beyond  the  Fourier-undersampling  problem  discussed  above.  Namely,  atmo¬ 
spheric  turbulence  necessitates  a  non-linear  formulation  of  the  reconstruction  problem. 
Recall  from  Chapter  1  that  there  are  (^“'’)  unknown  Fourier  phases  as  well  as  Nap  —  1 
imknown  piston  differences,  and  hence  inference  of  the  Fourier  phases  from  the  {^2)  phase 
measurements  available  in  a  non-redimdant  array  is  inherently  ill-posed.  The  traditional 
means  of  mitigating  this  issue  has  been  to  again  leverage  prior  constraints  on  the  image  (e.g. 
by  enforcing  compactness,  sparsity,  or  smoothness).  In  particular,  a  myriad  of  so-called  self¬ 
calibration  algorithms  have  been  developed  (see,  e.g.,  Pearson  and  Readhead  (1984))  which 
alternate  between  estimation  of  the  Fourier  phases  and  estimation  of  the  image  coefficients 
subject  to  the  imposed  constraints.  Such  algorithms  have  shown  to  be  successful  in  high-flux 
scenarios  in  which  the  interferometric  measurements  have  a  high  signal-to-noise  ratio  (SNR). 
In  low-flux  scenarios  on  the  other  hand,  one  must  integrate  observables  over  a  long  time 
period  in  order  to  build  sufficient  SNR  for  reliable  image  reconstruction.  Since  integration 
beyond  the  coherence  time  of  the  atmosphere  would  result  in  fringe  blurring,  one  is  then 
led  to  the  formation  and  subsequent  integration  of  atmosphere-invariant  observables  from 
each  atmosphere-coherence  time  {or  frame). 

In  Section  1.6  we  introduced  the  bispectrum  observable  as  the  classical  basis  for 
atmosphere-invariant  inference  in  optical  interferometry.  Formed  as  the  product  of  three 
fringe  phasors  associated  with  the  sides  of  a  baseline  triangle  (e.g.  bii,  b23,  and  bsi), 
the  bispectrum  is  a  non-linear  function  of  the  complex  visibilities  of  the  image.  Recall 
that  the  atmosphere-induced  terms  cancel  in  these  products  and  hence,  like  the  Fourier 
magnitudes,  these  so-called  bispectra  are  atmosphere-invariant  observables.  However,  for  a 
non-redundant  array  with  (^)  distinct  baselines,  recovery  of  the  Fourier  phases  from  the 
bispectra  phases  (i.e.  the  closure  phases)  remains  ill-posed  since  there  are  only  (^2^^)  indepen- 
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dent  closure  phases  (Readhead  et  al,  1988).  Successful  bispectra-based  image  reconstruction 
remains  feasible  in  spite  of  this  ill-posedness  (see  e.g.  Thiebaut  (2013),  Besnerais  et  al.  (2008)), 
but  again  prior  constraints  (e.g.  on  the  image  support)  must  be  enforced  to  regularize  the 
reconstruction. 

An  important  property  of  the  bispectrum  which  we  will  exploit  in  our  reconstruction 
method  is  its  invariance  to  scene  translation.  To  illustrate  this  property,  let  us  denote  the 
spatial  frequency  vectors  of  two  sides  of  a  bispectrum  triangle  as  u  and  v,  respectively. 
Then  the  spatial  frequency  of  the  remaining  side  will  be  w  :=  —  (u  +  v).  If  ^(p)  is  the 
corresponding  (normalized)  bispectrum  of  a  scene  at  its  original  position  p,  then  by  the 
shift  property  of  the  Fourier  transform,  the  bispectrum  of  the  translated  image  becomes: 

^(p_l_^p)  =  gi(eu+u-<5p)g/(0v+v-.5p)g;(e-(u+v)-(u+v)-^p)  ^2.2) 

=  g/(0u+0v+0-(u+v))  =  ^(p)  (2.3) 

where  we  have  dropped  the  aperture  subscripts  for  purposes  of  generality. 

Since  the  bispectra  are  non-linear  functions  of  their  underlying  image,  reconstruction 
from  these  observables  does  not  map  directly  onto  the  linear  compressed  sensing  framework 
in  Equation  (2.1).  To  address  this  challenge,  we  developed  an  interface  which  first  recovers 
Fourier  component  estimates  from  bispectra  and  Fourier  magnitudes,  using  a  nonlinear 
least  squares  solver.  With  the  Fourier  component  estimates  at  hand,  the  image  estimate 
can  then  be  recovered  using  standard  compressed  sensing  techniques.  The  approach  is 
diagrammed  in  Figure  2.1  below. 


2.3  Description  of  Stage  1  of  Algorithm 

The  goal  of  Stage  1  of  our  algorithm  is  to  use  the  measured  bispectra  and  Fourier  magnitudes 
to  recover  estimates  of  the  true  Fourier  components  of  the  image.  Recall  from  Section  1.3 
that  the  fringes  provide  measurements  of  the  Fourier  magnitudes  directly.  Flence  it  remains 
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Figure  2.1:  Overview  of  Proposed  Two-Stage  Approach 


An  Algorithmic  Interface 


to  recover  the  Fourier  phase  information.  Since  there  are  unknown  Fourier  phases  in 
a  non-redimdant  array  but  only  (^°2  independent  closure  phases,  this  sub-problem  is 
itself  ill-posed.  Namely,  we  carmot  hope  to  find  a  unique  solution  for  the  Fourier  phases 
simply  by  minimizing  the  residuals  of  the  bispectra  data.  Instead  our  algorithm  employs  a 
gradient-based  search  to  find  the  phase  set  that  minimizes  a  regularized  objective  consisting 
of  a  data  term  and  a  prior  term.  The  data  term  is  the  sum  of  the  squared  bispectrum 
residuals.  The  prior  term  is  a  metric  that  favors  those  Fourier  phase  vectors  corresponding 
to  a  compact  image  intensity  This  kind  of  joint  metric  has  been  suggested  for  optical 
interferometric  applications  before  in  the  work  of  Thiebaut  (2013).  Flowever,  whereas  the 
method  in  Thiebaut  (2013)  searches  for  a  metric-minimizing  image,  our  method  first  searches 
for  the  metric-minimizing  baseline  phase  set,  which  is  inherently  a  smaller  set  given  the 
undersampled  nature  of  the  problem. 


0  =  argminfrfata(0/g/a)  -F  pi  prior 


(2.4) 


The  data  (or  residual)  and  prior  terms  are  defined  as  follows. 


fdataid,  g/a) 


Nf, 

E 


k=l 


1 

—  {gk  -  akiakiaksexpHiOki  +  0jc2  + 

ZVk 


(2.5) 


where  is  the  k-th  bispectrum,  are  the  unknown  Fourier  phases  in  the  k-th 

bispectrum,  the  {aj^i}  are  estimates  for  the  magnitudes  of  the  Fourier  coefficients  in  the  k-th 
bispectrum,  and  Wk  is  a  weighting  factor  proportional  to  the  estimated  variance  of  the  k-th 
bispectrum. 


fpnor(0/a)  —  ||Re[F  ya^g]  O  (1  hgfluss!fln)||  (2.6) 

where  F  is  the  partial  DFT  matrix  whose  rows  are  2D-sinusoidal  basis  functions  sampled 
by  the  array,  ya,0  is  the  vector  of  estimated  Fourier  coefficients  with  magnitudes  given  by  a 
and  phases  given  by  0,  hgaussian  denotes  a  centered  Gaussian  window,  and  the  operation  o 
denotes  point-wise  multiplication. 

An  intuitive  interpretation  of  the  regularization  above  can  be  obtained  by  decomposing 
the  computation  into  steps.  We  seek  to  fit  the  bispectra  with  a  set  of  Fourier  phases  6  which 
together  with  the  estimated  Fourier  magnitudes  produce  an  image  with  compact  energy. 
The  data  term  in  the  metric  (data  is  a  sum  of  the  squared  residuals  of  measured  bispectra 
with  respect  to  the  bispectra  associated  with  the  phase  vector  6  and  Fourier  magnitude 
estimates  a.  The  steps  for  computation  of  the  prior  term  are  as  follows.  First  we  associate 
the  phases  in  9  with  their  corresponding  magnitude  estimates  in  a  to  form  estimates  for 
the  complex  Fourier  components  ya^g.  We  then  apply  the  pseudo-inverse  of  the  partial 
DFT  matrix  F  to  the  vector  ya,g  and  retain  the  real  part  thereby  obtaining  an  image-space 
representation  of  ya,g.  Finally  we  perform  point- wise  multiplication  of  this  image  with  a 
function  of  the  form  shown  in  Figure  and  compute  the  norm  of  the  result.  This  operation 
computes  a  compactness  metric  which  penalizes  ya,g  in  proportion  to  its  spatial  energy 
spread.  Note  that  the  actual  position  of  the  Gaussian  taper  is  immaterial;  as  established 

^Since  the  rows  of  F  are  orthogonal  and  therefore  the  pseudo-inverse  is  obtained  by  a  Hermitian  transpose 
operation  F* 
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Figure  2.2:  Inverted  Gaussian  penalty  function  (1  —  hgaussian):  the  dark  region  represents  an  area  of  near-zero 
penalty 


above,  the  bispectrum  is  a  translation-invariant  measurement  and  hence  the  joint  metric  will 
be  invariant  to  shifts  of  the  same  object. 

The  joint  data-prior  metric  in  Equation  (2.4)  can  be  minimized  using  a  suitable  non¬ 
linear  least-squares  solver.  For  our  processing,  we  have  elected  to  use  the  MATLAB® 
implementation  (MATLAB,  2012)  of  the  trust-region  reflective  non-linear  least-squares 
solver. 

2.4  Description  of  Stage  2  of  Algorithm 

Stage  2  takes  the  Fourier  estimates  obtained  in  Stage  1  and  attempts  to  recover  an  image 
using  fast  compressed  sensing  techniques.  This  amounts  to  solving  the  linear,  under¬ 
determined  inference  problem: 


y  =  Fx  -|-  n  (2.7) 

where  y  is  the  vector  of  Fourier  estimates  from  Stage  1,  F  is  the  partial-DFT  matrix  whose 
rows  are  the  complex-sinusoidal  basis  functions  corresponding  to  the  spatial  frequencies 
sampled  by  the  array^,  and  n  is  the  (unknown)  residual  error  in  these  estimates.  Since 


^Note  that  F  need  not  be  the  same  as  F,  as  we  are  free  to  assume  an  independent  pixel  resolution  in  each 
case.  This  resolution  then  specifies  the  granularity  of  the  sampling  of  the  sinusoids  forming  the  rows  of  the 
matrix. 
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in  practice  there  are  fewer  available  Fourier  measurements  than  the  desired  number  of 
resolution  elements  in  the  image,  this  inference  problem  is  ill-posed  from  a  Nyquist-sampling 
perspective.  Hence  once  again  we  must  regularize  the  problem  by  imposing  constraints 
on  the  image.  We  selected  two  different  compressed  sensing  regularization  strategies 
due  to  their  widespread  success  in  solving  similar  inverse  problems  involving  Fourier 
imdersampling.  The  first  of  these  approaches.  Basis  Pursuit  Denoising  (BPDN),  seeks  to  find 
the  image  of  smallest  LI -norm  which  also  agrees  with  the  estimated  Fourier  component 
estimates  to  within  a  certain  tolerance  e.  We  performed  this  Ll-minimization  in  the  wavelet 
domain  as  opposed  to  the  pixel  domain,  as  it  is  well-known  that  natural  images  tend  to  be 
more  compressible  in  the  former  domain  than  in  the  latter.  The  other  regularization  we 
tested.  Total  Variation  Minimization  (TV-Min)  is  very  similar,  but  seeks  instead  to  minimize 
another  quantity  known  to  be  compressible  in  natural  images:  the  Ll-norm  of  the  gradient 
of  the  image.  The  two  regularization  techniques  are  specified  mathematically  below: 


(Basis  Pursuit  Denoising):  x  =  argmin  || a ||^  subject  to: 

a 


FY  ^a-y 


<  e 

2 


(2.8) 


(Total  Variation  Minimization):  x  =  argmin  ||a||j.y  subject  to:  ||Fa  —  y  II2  <  e  (2.9) 

a 

where  ||a||j.y  is  the  Ll-norm  of  the  2D  pixel  gradient  of  the  image,  i.e.  Hajljy  := 

The  software  package  NESTA  (Becker  et  ah,  2011)  was  used  to  perform  the  Stage  2 
optimizations  above. 


2.5  Algorithm  Performance 

2.5.1  Laboratory  Validation 

To  test  the  effectiveness  of  our  algorithm  when  applied  to  real  interferometric  data,  we 
collected  fringe  data  with  MIT  Lincoln  Laboratory's  (MIT/LL)  Fizeau  Interferometer.  The 
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Figure  2.3:  Non-redundant  Golay  20-aperture  pattern  (left)  and  corresponding  UV-sampling  (right) 


interferometer  is  a  reconfigurable  fiber-coupled  system  that  allows  all  (2^)  baseline  conjugate 
pairs  to  be  measured  simultaneously  by  projecting  light  from  all  apertures  onto  a  Fizeau 
beam-combiner.  The  aperture  pattern  used  in  the  experiments  is  shown  in  Figure  2.3.  The 
ratio  r  of  the  maximum  baseline  to  the  minimum  baseline  was  17  in  both  x  and  y  spatial 
coordinates.  Hence  the  number  of  resolution  elements  required  to  uniquely  specify  our 
diffraction-limited  scene  was  (2r)^,  yielding  an  undersamplmg  ratio  of  p  =  ~  0.33. 

The  interferometer  was  illuminated  with  the  far-field  projection  of  a  range  of  targets:  chrome- 
on-glass  transparency  masks  were  photolithographically-prepared  and  placed  at  the  focus 
of  an  ^  three-mirror  off-axis  telescope.  When  the  target  is  illuminated  with  the  white  light, 
the  far-field  projection  of  the  target  is  produced  at  the  telescope  aperture,  where  it  can  be 
sampled.  As  a  reference  for  reconstruction,  we  consider  a  reduction  of  the  actual  target  to 
the  fundamental  diffraction-limited  resolution  of  the  instrument  as  shown  in  Figure  2.4. 

5000-frame  reconstruction  results  are  shown  in  Figure  for  the  following  photoelectron 
(pe)  levels:  (from  left  to  right)  320e3  pe/ aperture /frame,  54e3  pe/ aperture /frame,  and  15e3 
pe/ aperture/ frame. 
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Figure  2.4:  Target  chrome  mask  (left)  and  corresponding  truth  image  at  diffraction-limited  resolution  (right) 
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Figure  2.5:  Image  Reconstruction  Results  from  Laboratory  Validation 
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Figure  2.6:  Image  Reconstruction  Results  from  Simulations 
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2.5.2  Simulation 

In  addition  to  the  Laboratory  validation,  we  also  conducted  5000-frame  simulations  with  the 
same  aperture  pattern  and  the  measurement  model  (2)  to  compare  algorithm  performance  in 
the  presence  of  imknown  atmospheric  distortion  against  that  if  this  distortion  were  known. 
Note  that  in  the  latter  "atmosphere  oracle"  case,  bispectrum  formation  is  not  required  and 
direct  CS  reconstruction  from  complex  amplitudes  is  feasible.  Hence  this  comparison  allows 
a  quantitative  measure  of  the  reconstruction  penalty  suffered  from  bispectrum  formation. 
The  simulation  is  depicted  in  the  block  diagram  in  Figure  2.6. 

We  used  two  Image  Quality  Metrics  to  assess  performance:  the  standard  metric  Normal¬ 
ized  Mean-Squared  Error  (NMSE)  and  the  Structural  Similarity  (SSIM)  metric  (Wang  et  ah, 
2004).  The  NMSE  is  defined  as: 


NMSE  =  20  log 


X  -  Xq 

xo 


(2.10) 


Neither  metric  captured  the  visual  similarity  of  the  raw  reconstructions  of  our  two-stage 
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Figure  2.7:  Image  Reconstruction  Results  from  Simulations 
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Figure  2.8:  Algorithm  Performance  in  Simulation 
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algorithm  relative  to  ground  truth.  To  address  this,  we  uniformly  hard-thresholded  each 
of  the  raw  reconstruction  pixels  before  evaluating  the  metrics.  Figure  2.7  shows  our  two- 
stage  reconstructions  before  thresholding  (left),  the  same  reconstructions  after  thresholding 
(middle),  and  the  atmosphere-oracle  reconstructions  (right).  Results  from  both  a  low-light- 
level  scenario  (top  row)  and  a  high-light-level  scenario  (bottom  row)  are  shown.  Figure  2.8 
uses  NMSE  and  SSIM  image  quality  metrics  to  quantify  reconstruction  performance  for  our 
two-stage-algorithm  reconstructions  after  thresholding  (red),  and  the  atmosphere-oracle 
reconstruction  (black).  We  see  that  the  algorithm  achieves  performance  close  to  that  of  the 
atmosphere  oracle  for  a  wide  range  of  light  levels. 
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Chapter  3 


Pattern  design  criteria  for  uniqueness 
in  phase  recovery 

3.1  Chapter  Overview 

We  provide  new  results  enabling  robust  interferometric  image  reconstruction  in  the  presence 
of  unknown  aperture  piston  variation  via  the  technique  of  Redundant  Spacing  Calibration 
(RSC).  The  RSC  technique  uses  redundant  measurements  of  the  same  interferometric 
baseline  with  different  pairs  of  apertures  to  reveal  the  piston  variation  among  these  pairs. 
In  both  optical  and  radio  interferometry,  the  presence  of  phase-wrapping  ambiguities  in 
the  measurements  is  a  fundamental  issue  that  needs  to  be  addressed  for  reliable  image 
reconstruction.  In  this  chapter,  we  show  that  these  ambiguities  affect  recently-developed 
RSC  phasor-based  reconstruction  approaches  operating  on  the  complex  visibilities,  as 
well  as  traditional  phase-based  approaches  operating  on  their  logarithm.  We  also  derive 
new  sufficient  conditions  for  an  interferometric  array  to  be  immune  to  these  ambiguities 
in  the  sense  that  their  effect  can  be  rendered  benign  in  image  reconstruction.  We  show 
the  implications  of  this  property  of  wrap-invariance  for  imaging  via  phase  closures  and 
extend  existing  results  involving  the  classical  three-baseline  closures  to  generalized  closures. 
Furthermore  we  show  that  this  property  is  conferred  upon  arrays  whose  interferometric 
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graph  satisfies  a  certain  cycle-free  condition.  We  specify  this  condition  and,  for  cases  in 
which  this  condition  is  not  satisfied,  we  provide  a  simple  algorithm  for  identifying  those 
graph  cycles  which  prevent  its  satisfaction.  We  apply  this  algorithm  to  diagnose  and  correct 
a  member  of  a  pattern  family  popular  in  the  literature. 

3.2  Preliminaries 

3.2.1  Lattices 

A  lattice  is  a  mathematical  object  describing  a  repeating  pattern  of  discrete  points  in  space. 
It  arises  in  a  variety  of  scientific  and  engineering  contexts  including  crystallography  and 
communication  theory.  We  will  use  the  following  Definition  for  a  lattice  A: 

VfliGzj  (3.1) 

If  the  generating  vectors  v,  are  not  linearly-independent,  one  might  ask  for  another, 
more  compact  description  of  A.  Consider  one  such  candidate  set  of  linearly-independent 
vectors  by.  If  all  integer  linear  combinations  of  by  lie  in  A,  and  every  lattice  point  in  A  can 
be  represented  as  an  integer  linear  combination  of  the  by,  then  we  say  that  the  vectors  vy 
form  a  basis  for  the  lattice  A. 

Many  lattice  algorithms,  including  algorithms  for  the  Closest  Vector  Problem  introduced 
in  the  next  section,  require  a  reduced  lattice  basis.  A  reduced  basis  is  one  consisting  of  vectors 
which  are  short  and  nearly-orthogonal.  A  well-known  procedure  to  create  a  basis  for  a 
given  lattice  with  short,  nearly-orthogonal  vectors  is  known  as  the  LLL  algorithm  due  to 
Lenstra  et  al.  (1982). 

An  LLL-based  routine  for  forming  a  reduced  lattice  basis  from  a  set  of  generating  vectors 
which  are  not  necessarily  linearly-independent  is  given  in  Cohen  (1993). 
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3.2.2  The  Closest- Vector-Problem 


A  well-known  problem  in  lattice  theory  is  the  Closest-Vector-Problem,  which  can  be  described 
as  follows: 

Problem  3.2.1.  Given  a  basis  {bi,b2,  ...,hn}/or  a  lattice  in  R""  and  a  vector  w  G  W^,find  the 
point  in  the  lattice  closest  in  Euclidean  distance  from  w. 

Several  algorithms  exist  for  solving  this  problem.  A  popular  class  of  algorithms,  known 
as  the  Sphere-Decoding  algorithms,  are  efficient  searches  for  the  closest  lattice  point  within 
a  hypersphere  of  a  certain  radius  centered  on  the  input  vector  (see  e.g.  Agrell  et  al.  (2002)). 
For  the  simulations  in  this  chapter,  we  instead  use  the  lower-complexity  Bahai  Nearest 
Plane  (Babai-NP)  algorithm  (Babai,  1986).  For  lattice  bases  which  are  nearly  orthogonal,  this 
algorithm  offers  reliable,  albeit  not  guaranteed,  performance  in  practice.  Pseudo-code  for 
one  implementation  of  this  algorithm  due  to  Galbraith  (2012)  is  given  in  listing  Algorithm  1. 

Algorithm  1  Babai  Nearest-Plane  Algorithm 

Input:  Basis  for  lattice  A  (i.e.  {hi, . . .  ,h„}),  and  w  G  R"* 

Output:  Element  v*  nearest  to  w  in  A 

Compute  orthogonal  basis  {bp . . . ,  b* }  using  Gram-Schmidt  procedure 
for  i  =  n  downto  1  do 
Set  /•  -  ^ 

Set  ji  = 

Set  w;_i  =  w,  -  (/,•  -  [li])h*  -  [li]hi 

end  for 

return  v  =  yi  -|-  . . .  y„ 


A  conceptual  interpretation  of  this  recursive  algorithm  is  as  follows.  At  the  root  level  of 
course  the  algorithm  finds  the  vector  in  A  closest  to  w;  in  other  words  it  solves  the  CVP 
problem  (A„,  w).  First  consider  the  subspace  U  spanned  by  the  first  n  —  1  basis  vectors, 
i.e.  U  =  spanfbi, . . .  ,b„_i}.  The  algorithm  begins  by  computing  the  translate  y  G  A  of 
U  closest  to  w,  i.e.  the  nearest  plane  to  w.  The  algorithm  then  implicitly  computes  the 
projection  w'  of  w  onto  this  translate  U  -\-y.  It  then  translates  this  projection  back  to  the 
origin  (i.e.  w„_i  :=  w'  —  y),  and  solves  the  smaller  problem  within  U,  i.e.  the  CVP  problem 
(A„_i,  w„_i),  where  A„_i  is  the  lattice  formed  by  the  first  n  —  1  basis  vectors.  The  process 
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continues  recursively  until  the  final  CVP  problem  with  the  lattice  spanned  by  bi  is  solved. 
The  final  output  is  then  the  sum  of  the  translates  from  each  level  of  the  recursion. 

3.3  Problem  Statement  and  Related  Work 

In  Chapter  2  we  developed  an  algorithmic  interface  which  regularizes  the  ill-posed  problem 
of  recovering  Fourier  phase  from  the  bispectrum  by  enforcing  prior  constraints  on  the  image. 
An  alternative  and  intrinsically  well-posed  approach  to  prior-regularized  reconstruction  is 
to  use  baseline  redundancy  to  explicitly  solve  for  OPD  variation;  an  array  with  baseline 
redundancy  contains  repeated  instances  of  the  same  baseline  involving  distinct  aperture 
pairs.  Since  Fourier  phases  can  be  assumed  to  be  equal  for  all  repeated  baselines,  an  observed 
difference  amongst  their  corresponding  measurements  exposes  the  contribution  of  the  OPD. 
This  idea  of  using  redundant  arrays  to  calibrate  out  OPD  variation,  known  as  redundant 
spacing  calibration  (RSC),  was  developed  in  works  such  as  those  by  Arnot  et  al.  (1985)  and 
Greenaway  (1990).  In  recent  years,  irmovation  in  optical  technology  has  engendered  a 
revival  of  interest  in  the  RSC  technique.  The  simultaneous  (or  Fizeau-style)  measurement 
of  fringes  on  a  common  focal  plane  has  long  been  a  popular  method  of  acquiring  many 
baseline  measurements  in  an  economical  marmer.  However,  the  Fizeau  method  had  been 
incompatible  with  RSC  techniques  since  the  fringes  formed  by  each  set  redundant  baselines 
would  alias  on  the  focal  plane.  An  elegant  solution  to  this  problem  was  proposed  by  Perrin 
et  al.  (2006).  This  work  developed  the  idea  of  segmenting  the  entrance  pupil  of  a  single 
telescope  into  an  RSC  arrangement  of  sub-pupils  from  which  the  light  was  then  coupled 
via  single  mode  fiber  to  a  non-redundant  exit  pupil,  thereby  permitting  unambiguous 
and  simultaneous  fringe  detection  for  an  RSC  array.  A  reconstruction  algorithm  for  this 
architecture  was  then  proposed  in  Lacour  et  al.  (2007).  Even  more  recently,  RSC  has  been 
implemented  as  the  calibration  scheme  of  choice  for  several  radio  interferometers:  the 
Donald  C.  Backer  Precision  Array  for  Probing  the  Epoch  of  Reionization  (PAPER)  in  South 
Africa  (see  Ali  et  al.  (2015)),  the  MIT  Epoch  of  Reionization  (MITeOR)  in  the  United  States 
(see  Zheng  et  al.  (2014)),  and  the  Ooty  Radio  Telescope  (ORT)  in  India  (see  Marthi  and 
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Figure  3.1:  Fraction  of  redundant  baselines  required  for  critical  redundancy  vs.  aperture  count 
Chengalur  (2014)). 

As  will  be  shown  below,  N  —  3  independent  redundant  relations  are  required  for  unique 
determination  of  atmosphere  and  Fourier  phases  -  a  condition  we  will  refer  to  as  critical 
redundancy.  An  oft-cited  drawback  of  the  RSC  approach  is  that  it  reduces  the  number  of 
unique  spatial  frequencies  measured  by  the  interferometer.  However,  as  Figure  3.1  illustrates, 
the  fraction  of  distinct  uv-samples  sacrificed  for  critical  redundancy  becomes  increasingly 
negligible  as  the  number  of  apertures  in  the  array  increases.  Nevertheless  the  RSC  technique 
presents  other  challenges  which  must  be  overcome  for  reliable  imaging.  Central  among 
these  challenges  is  the  problem  of  integer  phase  ambiguities  which  arise  from  the  fact  that 
the  interferometric  phase  is  only  known  modulo  2n.  Indeed  these  ambiguities  have  been 
shown  to  play  an  important  role  in  accurately  recovering  sensor  complex  gains  and  object 
complex  visibilities  while  imaging  with  real  interferometric  instruments  (see  e.g.  Liu  et  al. 
(2014),  Eastwood  et  al.  (2009)).  In  this  chapter,  we  describe  these  ambiguities  and  how  they 
can  be  mitigated  using  a  combination  of  lattice  theory  algorithms  and  careful  array  design. 
We  will  see  that  these  ambiguities  have  a  fundamental  presence;  namely,  they  exist  whether 
the  calibration  strategy  works  with  complex  visibilities  (which  we  call  the  Phaser  approach) 
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directly,  or  their  respective  logarithms  (which  we  call  the  Phase  approach).  To  the  best  of 
our  knowledge,  the  results  in  this  chapter  are  the  first  to  provide  array  conditions  allowing 
imambiguous  interferometric  phase  determination  in  spite  of  wrap  ambiguities  in  the  Phase 
approach,  and  corresponding  false  minima  in  the  objective  of  the  Phaser  approach. 

To  motivate  the  analysis  in  the  chapter,  we  provide  an  example  illustrating  the  effect 
that  wrap  ambiguities  can  have  in  RSC-based  image  reconstruction.  Consider  the  pattern 
depicted  in  the  Figure  3.2.  This  pattern  belongs  to  one  of  the  more  popular  array  classes  in 
the  interferometry  literature:  the  so-called  Y-patterns  (see  e.g.  Arnot  et  al.  (1985),  Blanchard 
et  al.  (1996),  Labeyrie  et  al.  (2006),  Eastwood  et  al.  (2009),  Liu  et  al.  (2014)).  The  corresponding 
spatial,  or  UV,  sampling  is  provided  in  the  right  panel  of  the  Figure. 

To  demonstrate  the  potential  effect  of  wrap  ambiguities  on  reconstruction,  we  simulated 
both  Phase-  and  Phasor-based  reconstruction  from  noiseless  measurements  with  this  pattern. 
The  results  are  shown  in  Figure  3.3.  The  upper  left  panel  shows  the  true  image,  and  the 
upper  right  panel  shows  the  inverse  Fourier  transform  of  the  UV-sampled  visibility  function 
of  the  object  (i.e.  the  so-called  dirty,  or  interferometric  image).  The  lower  left  panel  shows  the 
reconstruction  result  with  an  implementation  of  the  Phase  method  similar  to  that  developed 
by  Lannes  (2003),  and  the  lower  right  panel  shows  the  same  for  an  implementation  of  the 
Phasor  method  (Marthi  and  Chengalur,  2014)  (Lacour  et  ah,  2007).  Reconstruction  suffers 
from  phase  wrapping  error  in  the  former  case,  and  a  corresponding  false-minimum  trap 
in  the  Phasor  case.  In  the  course  of  this  chapter,  we  will  first  identify  this  ambiguity  from 
a  mathematical  perspective,  relate  it  to  a  particular  physical  structure  (i.e.  the  existence 
of  a  certain  type  of  loop  in  the  interferometric  graph),  and  provide  a  simple  algorithm  for 
identifying  such  structures  in  an  arbitrary  array  so  that  they  can  be  remedied. 

The  chapter  is  organized  as  follows.  In  Section  3.4,  we  review  previous  work  on  the 
integer  ambiguity  problem,  and  discuss  its  presence  in  the  Phase  approach.  We  provide 
new  mathematical  conditions  for  an  aperture  pattern  to  be  wrap-invariant,  meaning  that 
the  effect  of  the  27r-periodicity  of  the  measured  interferometric  phase  can  be  eliminated 
in  image  reconstruction.  These  results  are  founded  upon  techniques  from  lattice  theory. 
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Figure  3.2:  Y-pattern  Array  Example 
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Figure  3.3:  Reconstruction  Results  for  Y-Pattern  Example 
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as  well  as  the  well-known  Smith  Normal  Form  (SNF)  of  an  integer  matrix.  We  show  the 
implications  of  these  results  on  imaging  with  three  types  of  interferometric  observables: 
the  baseline  phase  measurements,  their  traditional  closure  phases,  and  generalized  closure 
phases.  In  Section  3.5,  we  relate  these  mathematical  conditions  to  conditions  on  the  aperture 
pattern  itself.  Namely  we  show  that  wrap-invariance  is  conferred  upon  arrays  satisfying  a 
certain  loop-free  condition.  As  an  illustrative  example,  we  diagnose  a  pattern  belonging 
to  the  popular  Y-pattern  class  and  remedy  it  to  be  loop-free.  In  Section  3.6,  we  show  that 
the  computationally-complex  SNF-based  approach  for  ambiguity  resolution  is  actually  not 
necessary  for  wrap-invariance  in  many  cases,  as  long  as  a  wrap-induced  image  shift  can  be 
tolerated  ^ .  For  such  scenarios,  we  show  that  wrap-invariance  provides  a  certificate  for  the 
success  of  existing  Phase  and  Phasor-based  approaches  in  avoiding  wrap-induced  errors 
in  the  former,  and  false  global  minima  in  the  latter.  Finally  we  summarize  our  results  in 
Section  3.7. 

3.4  Phase  Wrapping  Ambiguities  in  RSC  Image  Reconstruction 

In  this  Section,  we  describe  a  Phase  Approach  algorithm  leveraging  the  CVP-approach  for 
phase-wrap  resolution,  which  to  the  best  of  our  knowledge  was  first  developed  in  Larmes 
and  Anterrieu  (1999).  We  begin  by  identifying  the  fundamental  phase  ambiguity,  and 
subsequently  assess  its  impact  on  the  phase  error  in  the  RSC  phase  solution.  In  the  process, 
we  develop  mathematical  conditions  for  wrap-invariance  which  will  form  the  basis  for  the 
notion  of  a  wrap-invariant  pattern  in  Section  3.6.  Finally,  we  assess  our  results  in  the  context 
of  similar  results  for  approaches  requiring  the  use  of  closure  phases  (Cannes,  2003). 

3.4.1  Identifying  the  Fundamental  Phase  Ambiguity 

The  traditional  approach  to  RSC  reconstruction  operates  on  the  measured  baseline  phases 
(see  e.g.  Arnot  et  al.  (1985),  Greenaway  (1994)).  To  illustrate  the  approach,  let  us  consider  an 

^As  will  be  shown,  this  image  shift  is  distinct  from  the  fundamental  degeneracy  of  object  position  in 
interferometric  measurements 
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interferometer  which  operates  at  a  wavelength  A  with  two  apertures  (say,  i  and  j)  separated 
by  a  vector  baseline  distance  of  b,y.  In  the  absence  of  any  optical  path  difference,  the 
interference  pattern  formed  by  these  two  apertures  encodes  a  sample  of  the  object's  Fourier 
Transform  at  spatial  frequency  Let  the  true  Fourier  phase  (which  we  will  refer  to  as 
object  phase),  measured  by  this  interference  pattern  be  denoted  as  Op.  The  measured  phase  is 
given  by: 


f^ij  —  dij  +  </’;■  ~  (pi  +  2/16  (3.2) 

where  (pj  —  (pi  is  the  optical  path  difference  between  apertures  j  and  i,  and  e  is  unknown 
phase  wrap  integer  arising  from  the  fact  that  interferometric  phase  measurements  are  only 
known  modulo  2n. 

Consider  an  interferometric  array  which  simultaneously  makes  many  such  measurements 
amongst  its  N  apertures.  Suppose  that  of  the  array's  (^)  baselines,  d  of  them  are  distinct. 
Further  suppose  we  have  a  solution  set  {cpi}  and  {Ojj}  for  these  equations.  Let  r,-  denote  the 
vector  position  of  the  z-th  aperture.  As  noted  by  several  authors  (see,  e.g.  Wieringa  (1992)), 
we  can  obtain  another  valid  solution  set  simply  by  replacing  each  (pi  with  (p^  =  (pi +  (po  +  z-  r,, 
and  each  with  =  9ij  —  z  ■  (ry  —  r,),  for  arbitrary  (pQ  and  z.  Since  the  free  vector  z  is  a 
two-parameter  vector  representing  the  inherently-ambiguous  position  of  the  image  within 
the  Field-of-View  and  the  free  parameter  (pQ  is  simply  a  scalar  piston  offset,  the  kernel  of  the 
RSC  system  is  three-dimensional.  This  is  the  tilt-position  degeneracy  which  is  fundamental  in 
interferometric  reconstruction.  The  end  result  is  that  the  RSC  system  contains  d  unknown 
distinct  object  phases  and  N  unknown  aperture  pistons,  and  is  rank-deficient  by  at  least 
3.  This  implies  that  there  are  at  most  (d  N  —  3)  linearly-independent  equations  in  the 
RSC  system,  and  hence  at  most  N  —  3  redundant  relations  can  be  linearly-independent. 
Remaining  relations  can  be  expressed  as  linear  combinations  of  the  elements  in  this  set. 
A  commonly-used  simple  example  of  such  dependencies  (Greenaway,  1994)  is  shown  in 
Figure  3.4.  The  phase  measurement  1634  associated  with  the  baseline  ^34  can  be  expressed  as 
a  linear  combination  of  those  associated  with  the  other  baselines,  i.e.  1634  =  /3i2  —  j3i3  /l24- 
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Figure  3.4:  Example:  Dependent  Parallelogram  redundancy. 


Figure  3.5:  Example:  5-aperture  RSC  pattern.  The  six  distinct  baselines  are  shown. 


We  will  assume  for  the  remainder  of  the  chapter  that  our  array  contains  N  —  3  indepen¬ 
dent  relations.  Under  this  assumption,  we  could  in  principle  solve  for  a  particular  solution 
of  this  system  by  arbitrarily  setting  two  object  phases  (whose  spatial  frequencies  are  not 
co-linear)  and  one  piston  phase.  This  particular  solution  would  then  differ  from  the  true 
solution  by  a  phase  ramp  in  the  Fourier  domain,  corresponding  to  an  image  shift  in  the 
spatial  domain.  In  this  section  we  will  instead  construct  a  different  particular  solution  which 
is  immune  to  the  effects  of  phase-wrapping  by  design. 

As  an  example,  consider  the  simple  array  in  Figure  3.5.  There  are  (2)  =  10  baselines,  of 
which  4  are  redundant.  A  critical  array  of  5  apertures  would  have  2  redundancies.  Therefore 
this  array  possesses  more  redundancies  than  necessary  (call  it  strongly  redundant),  and  we 
anticipate  that  the  resulting  system  will  be  overdeter  mined. 

The  measurement  equations  associated  with  this  array  can  be  written  in  matrix  form: 
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1  0  0 
0  1  0  0  0  0  0 
0  0  1  0  0  0  0 


0  0  0  1  -1 
1 
0 


0  0  0  1  0  0  0  0 

0  0  0  0  1  0  1  0 

0  0  0  1  0  0  0  1 

0  0  -1  0  0  0  1  0 

-10  0  0000  0 
0  0  0  0  0  1  0  1 

0  1  0  0  0  0  1  0 


0  0 

-1  0 

1  -1 


0 

0 
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0 

-1 

0 

0 

1 

0 

0 


1  -1 

0  0 

-1  0 

-1  0 

0  -1 

0  -1 

0  -1 


@12 

023 

034 

045 

013 

025 

<pl 

(p2 

(p3 

(p4 

h 


=  +  27re 


(3.3) 


Denoting  the  matrix  above  as  M,  we  can  write  such  systems  in  compact  form  as: 


M 


e 

•P 


=  jS  +  Ine 


(3.4) 


The  example  above  illustrates  that  the  general  phase  measurement  matrix  will  have  two 
sets  of  columns:  one  corresponding  to  the  object  phases,  and  one  corresponding  to  the  path 
differences.  We  can  divide  the  range  of  the  matrix  into  two  respective  subspaces  as  follows: 


Definition  3.4.1.  Two  fundamental  subspaces  (Lannes  and  Anterrieu,  1999):  The  spectral 
phase  space  K  is  the  span  of  the  d  columns  associated  with  the  Fourier  phases,  or  equivalently 
the  range  of  the  sub-matrix  Mg  formed  by  these  columns.  The  piston,  or  aberration,  phase 
space  L  is  the  span  of  the  Nap  columns  associated  with  the  piston  phases,  or  equivalently 
the  range  of  the  sub-matrix  formed  by  these  columns. 

If  we  let  n  =  (^)  be  the  number  of  baselines  in  the  array,  the  phase  measurement  matrix 
M  will  be  of  size  n  x  [d  +  N).  For  a  strongly-redundant  array  like  the  one  in  the  example 
above,  the  column-space  fC  -|-  L  of  the  matrix  will  clearly  not  span  R”.  Therefore  the  wrapped 
measurement  vector  )8  will  not  in  general  fall  in  the  the  subspace  K  +  L  (and  potentially 
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can  be  quite  far  from  it).  In  the  absence  of  measurement  noise,  we  can  unwrap  these 
measurements  by  identifying  those  integer  correction  vectors  e  for  which  )8*  =  ^  +  2ne 
lies  in  fC  +  L.  In  the  presence  of  noise,  on  the  other  hand,  the  unwrapped  vector  will  not 
generally  lie  in  fC  +  L  (but  for  low-to-moderate  noise  will  be  in  the  vicinity).  Thus  we  search 
for  vector(s)  which  are  as  close  to  fC  +  L  as  possible  in  a  weighted  least-squares  sense 
(Lannes  and  Anterrieu,  1999),  i.e.  we  search  for  the  vector 


0RSC 

W 

( 

e 

\ 

trsc  = 

^RSC 

=  argmirie^e^^ 

j3*(e)-M 

4> 

J 

where  W  is  the  weighting  matrix.  If  we  let  E  denote  the  phase  measurement  covariance 
matrix  and  set  W  =  this  is  equivalent  to  searching  for  vectors  e  which  minimize  the 

projection  of  a  whitened  measurement  Wz  jS*  =  W^j6  -|-  27re)  onto  the  space  (fC  +  L)^  := 

1 

A:er((WzM)  ),  where  ker(A)  denotes  the  kernel,  or  nullspace,  of  the  matix  A. 

Specifically  we  seek  to  minimize: 

/(e)  =  ||PwWz()8  +  27re)|f  (3.6) 

where  Pw  is  a  matrix  representing  the  orthogonal  projection  from  R”  onto  {K  +  L)^. 
Letting  e'  =  — e,  we  can  rewrite  the  above  objective  function  as: 

2  2 

/(e')  =  ||PwWz(j8-27re')||  =  |  |PwWz^  -  27rPwWze'|  |  (3.7) 

Lannes  and  Anterrieu  (1999)  showed  that  this  optimization  problem  is  equivalent  to  the 
so-called  closest  vector  problem  (CVP)  in  the  theory  of  lattices.  We  will  define  a  lattice  L{Z") 
as  the  set  of  points  generated  by  integer  combinations  of  the  column  vectors  of  a  matrix 
L.  Letting  P  =  P^yWz,  our  optimization  problem  then  amounts  to  the  following:  Find  the 
lattice  point  in  P(Z'’)  which  is  closest  to  PjS.  Indeed  this  CVP  formulation  applies  in  a  very 
general  sense  to  phase-unwrapping  problems  taken  collectively  as  discussed  in  Wubben 
et  al.  (2011). 

A  compact  representation  of  the  lattice  L  is  given  by: 
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(3.8) 


{m<n—(d+N—3)  'j 

Y2  I  Vflf  G  Z  > 

where  {v,}  are  linearly-independent  and  together  form  a  basis  of  the  lattice. 

Suppose  we  have  foimd  a  basis  for  the  lattice  P(Z”),  and  we  have  solved  the  Closest 
Vector  Problem  for  a  given  measurement  vector  (c.f.  Section  3.2.2).  Let  b*  be  the  output 
of  the  Babai  Nearest  Plane  Algorithm  -  i.e.  it  is  the  lattice  point  which  is  the  closest  to  )8. 
We  now  seek  to  solve  for  the  wrap  vector  corresponding  to  this  lattice  point,  i.e.  we  seek  a 
solution  to: 


b*  =  Pe  (3.9) 

Note  that  P  is  a  (weighted)  projection  matrix  and  thus  not  full-rank,  and  therefore  there 
will  be  mfinitely-many  solutions  to  this  equation.  The  Closest- Vector-Problem  algorithm 
will  provide  one  particular  solution  Cp.  The  complete  set  of  solutions  is  then  given  by: 

e  =  ep  +  eh  (3.10) 

where  is  any  integer  vector  in  the  kernel  of  P.  Suppose  we  choose  one  such  vector 
and  correct  our  phase  measurement  vector  accordingly.  The  corrected  phase  measurement 
vector  can  be  written  as: 

=)8  +  27r(ep  +  e;,)  (3.11) 

Lemma  3.4.1.  E  K L,  Ve;, 

Proof:  The  fact  that  G  ker(Py^W^)  implies  that  W^e/j  G  /cer(Pw).  This  in  turn  implies 
that  Wze;,  G  zm(WzM)  and  hence  that  e/j  G  zm(M)  since  Wz  is  invertible  by  construction. 

□ 

While  any  choice  of  e;,  G  fC  -|-  L  will  admit  a  solution  to  Equation  (3.5),  let  us  consider 
the  optimal  one  eufl  which  minimizes  the  error  in  the  ultimate  phase  solution  Trsc-  Let 
jS*  =  ^  +  2n{ep  -|-  be  the  corresponding,  optimal  unwrapped  measurement  vector. 

/V 

From  Lemma  3.4.1,  we  see  that  the  unwrapped  vector  differs  by  some  Ineh  in  K  L 
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from  )Sq,  i.e. 


'^*  =  ^l+27zeh  (3.12) 

The  impact  of  this  fundamental  ambiguity  is  the  main  subject  of  this  chapter.  We  have 
depicted  the  situation  in  Figure  3.6  to  provide  a  visual  summary  of  the  linear  algebra 
involved.  As  we  have  seen,  the  RSC  process  begins  with  the  interferometric  phase  mea¬ 
surement  jS,  which  due  to  wrapping  will  in  general  lie  far  from  the  range  fC  +  L  of  the 
measurement  matrix.  By  solving  the  Closest- Vector-Problem  using  a  lattice  algorithm  such 
as  the  Bahai  Nearest  Plane  algorithm,  it  is  possible  to  find  a  particular  correction  vector 
Inep  which  when  added  to  jS  minimizes  the  residual  in  Equation  (3.5),  i.e.  the  weighted 

/V 

distance  from  fC  -|-  L.  In  the  noiseless  case,  this  unwrapped  vector  jS  will  lie  in  fC  -|-  L  (i.e. 
zero  residual),  whereas  in  the  noisy  case,  it  will  in  general  remain  outside  of  R  -|-  L?.  In 
either  case,  the  choice  of  the  residual-minimizing  vector  is  not  unique.  To  see  this,  let 
the  smallest  possible  residual  norm  among  all  unwrapped  candidates  be  denoted  as 
The  set  of  unwrapped  measurement  vectors  rmin  away  from  K  +  L  can  be  represented  as 
discrete  points  in  a  plane  parallel  to  fC  -|-  L,  each  of  which  corresponds  to  a  distinct  choice  of 
ep.  Within  this  family,  consider  the  optimum  vector  )3q  whose  corresponding  least-squares 
solution  Trsc  is  minimal.  All  candidate  unwrapped  vectors  are  within  an  error  vector  2nep 
of  )8q,  where  ejj  is  an  integer  vector  in  fC  -|-  T.  RSC  algorithms  are  fundamentally  blind  to 
such  errors;  distinct  unwrappings  and  )Sq  both  produce  solutions  to  Equation  (3.5)  in 
the  noiseless  case,  as  do  their  respective  projections  onto  K  L,  and  in  the 

noisy  case.  The  property  of  wrap-invariance  introduced  in  this  chapter  ensures  that  the 
effect  of  such  an  error  on  the  phase  of  the  resulting  RSC  solution  Trsc  is  either:  merely  a 
benign  integer  multiple  of  27r  (c.f.  Section  3.4.2),  or  a  linear  gradient  in  the  estimated  Fourier 
phases,  which  is  equivalent  up  to  an  extra  image  shift  (c.f.  Section  3.6)  in  the  reconstructed 
image.  In  order  to  develop  conditions  for  wrap-invariance  as  they  relate  to  pattern  design, 
we  must  first  characterize  the  effect  of  the  residual  wrap  error  on  the  RSC  least-squares 

^Without  loss  of  generality,  we  have  selected  ep  =  0  in  Equation  (3.10)  so  as  to  simplify  the  diagram 
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Noiseless 


Noisy 


P  (wrapped) 


Figure  3.6:  Illustration  of  the  fundamental  ambiguity  of  In-periodicity  in  RSC  imaging.  Distinct  unwrap¬ 
pings  p*  and  Pq  both  produce  solutions  to  Equation  (3.5)  in  the  noiseless  case,  as  do  their  respective  projections 
onto  X  +  L,  P*f^_^_i  and  Pq  in  the  noisy  case. 


solution.  This  is  the  aim  of  the  next  section. 


3.4.2  Quantifying  the  Effect  of  the  Fundamental  Ambiguity 

Let  us  now  quantify  the  effect  of  this  unresolved  wrap  error  on  the  ultimate  least-squares 
solution,  which  can  be  accomplished  in  two  easy  steps.  Following  standard  least-squares 
principles,  we  first  find  the  projection  pl^_^_i  of  the  unwrapped  P*  onto  K-\-L  whose  weighted 

A.  5j<  A  5(< 

distance  from  p  is  minimized.  Note  that  the  term  in  j8  is  already  in  X  -|-  L  and  hence 
unchanged  by  projection.  Hence  we  obtain: 


^x+l  =  W-2(I-Pw)WJ^*  =  A 


'0,K+L  + 


(3.13) 


where  Pq  is  the  projection  of  p  onto  K-\-  L.  We  then  solve  the  system: 


M 


6 


'K+L 


(3.14) 


As  aforementioned,  M  is  rank-deficient  (by  3),  and  hence  there  will  be  infinitely-many 
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solutions  to  this  system.  Successful  recovery  of  the  Fourier  phases  modulo  7.n  requires  a 
solution  preserving  the  integrality  of  the  error  term  e^.  In  this  case,  we  obtain  a  final  RSC 
solution  2  TIC  away  from  the  true  solution  for  some  integer  vector  c.  To  achieve  this,  we  rely 
on  the  integer  matrix  decomposition  known  as  the  Smith  Normal  Form,  which  is  described  in 
the  following  Theorem  and  Lemma: 

Theorem  3.4.2.  (Smith  Normal  Form)  (Smith,  1861):  Let  A  be  a  nonzero  m  x  n  integer  matrix 
with  rank  r.  There  exist  integer  and  unimodular  ^  (and  thus  invertible)  matrices  m  x  m  and  n  x  n 
matrices  U  and  V  respectively  such  that  the  matrix  product  D  =  UAV  is  a  diagonal  matrix  whose 
diagonal  entries  D,,  (the  so-called  elementary  divisors)  are  non-zero  integers  for  i  <  r,  and  zero  for 
i  >  r. 

Theorem  3.4.3.  (Elementary  Divisors)  (Smith,  1861):  The  product  of  the  elementary  divisors  is  the 
greatest  common  divisor  (gcd)  of  all  r  x  r  minors  of  A. 

The  proof  of  the  Theorems  above  can  be  found  in  textbooks  such  as  Newman  (1972).  □ 

Let  us  compute  the  Smith  Normal  Form  (SNF)  {U,  D,  V}  of  our  matrix  M.  Let  Um  = 
and  Vm  =  so  that  we  can  write: 


M  =  UmDmVm 

where  the  r  diagonal  elements  {d,}  of  are  the  elementary  divisors  of  M. 
We  can  now  re-write  Equation  (3.14)  above  as: 


(3.15) 


Let  us  choose  the  following  particular  solution  to  Equation  (3.16): 

TRSC  =  ^K+L 

unimodular  matrix  is  one  whose  determinant  is  ±1 


DmV 


M  VM 


e 

(p 


(3.16) 


(3.17) 
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where  denotes  the  pseudo-inverse  of  D.  The  resulting  error  is  then: 

esse  =  V^iD+ (3.18) 

Lemma  3.4.4.  Let  u  =  U^^e)^  .  The  residual  wrap  error  brsc  equals  0  mod  Itz  if  and  only  if 
mod  {ui,di)  =  0,Vz  <  r.  The  proof  of  this  Lemma  is  an  adaptation  of  a  standard  proof  which  can  be 
found  in  most  textbooks  covering  linear  Diophantine  equations  (see,  e.g.  Newman  (1972)). 

□ 

From  this  Lemma,  the  following  Corollary  is  clear: 


Corollary  3.4.5.  (Sufficient  condition  on  SNF  of  RSC  matrix  for  wrap-invariance):  If  the 

elementary  divisors  of  the  measurement  matrix  M  corresponding  to  a  certain  aperture  pattern  are  all 
1,  the  RSC  solution  defined  by  Trsc  (s  immune  to  phase-wrapping  error. 

□ 

RSC  patterns  consisting  of  apertures  placed  randomly  on  a  Cartesian  grid  appear  to 
satisfy  this  sufficient  condition  with  high  probability.  We  conducted  a  simple  experiment 
in  which  15  apertures  were  randomly  placed  on  a  10  x  10  grid.  Out  of  256  placements,  66 
were  valid  RSC  patterns  (i.e.  possessed  at  least  critical  redundancy),  and  of  these,  only  2 
had  non-unity  elementary  divisors. 

3.4.3  Relation  to  closure-phase  approaches 

The  SNF  has  been  been  applied  to  the  RSC  phase  problem  before  (Lannes,  2003).  Whereas 
we  have  chosen  to  apply  SNF  directly  to  the  baseline  measurement  matrix,  the  approach 
taken  by  Lannes  (2003)  is  to  instead  treat  the  piston-invariant  phases  of  the  bispectra  (the 
so-called  closure  phases)  as  the  fundamental  observables  from  which  the  object  phases  can  be 
inferred  via  the  relation: 


Co— >C  e  =  Jcl+lTZecl 


(3.19) 
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where  y^/  are  the  wrapped  closure  phases,  is  the  corresponding  wrap  vector,  and  Co->c  is 
the  matrix  mapping  the  distinct  object  phases  in  the  array  to  closure  phases.  Lannes  (2003) 
hence  applies  the  SNF  to  the  closure  matrix  Cg^c-  By  direct  analogy  to  Corollary  3.4.5, 
note  that  if  the  elementary  divisors  of  Cqc  are  all  1,  then  the  RSC  solution  is  immune  to 
phase-wrapping  error.  Using  closure  phases  as  observables  can  be  advantageous  in  low-light 
scenarios  in  which  there  is  not  sufficient  SNR  in  a  single  atmospheric  coherence  time  to 
reliably  measure  the  baseline  phases.  To  overcome  this  low  per-frame  SNR,  atmosphere- 
invariant  observables  such  as  the  bispectra  can  be  integrated  over  many  frames  to  build 
sufficient  SNR,  and  their  respective  closure  phases  used  as  reliable  phase  measurements. 
Since  the  baseline  phases  are  known  modulo  2n,  the  linear  combinations  of  them  which 
comprise  the  closure  phases  are  also  known  modulo  Ztt.  In  order  to  relate  this  condition 
to  Corollary  3.4.5,  let  us  first  define  another  closure  matrix  C^^c  which  instead  maps  the 
phase  measurements  to  closure  phases.  This  mapping  consists  of  equations  of  the  form: 


1/123  =  f^l2  +  /^23  —  ^13 


(3.20) 


where  1/123  is  the  closure  phase  associated  with  apertures  1,  2,  and  3,  and  the  j6,y  are  the 
associated  baseline  phases  (see  Equation  (3.2)).  Of  the  (3)  possible  closure  phases,  at  most 
(”2  can  be  linearly-independent  (see  e.g.  Readhead  et  al.  (1988)).  One  commonly-chosen  set 
of  such  linearly-independent  relations  consists  of  all  the  closure  triangles  involving  a  given 
aperture  A,  and  this  is  the  set  selected  by  Larmes  (2003).  Qm^c  is  therefore  an  (”2')  X  (2) 
matrix.  Lannes  (2003)  accordingly  provides  a  convenient  grouping  of  the  baselines  into  two 
categories:  (1)  spanning  tree  baselines  which  connect  aperture  A  to  all  other  apertures,  and 
(2)  loop  entry  baselines  which  provide  the  closure  for  these  spanning  tree  baselines.  This 
categorization  is  depicted  in  Figure  3.7. 

Given  this  categorization,  we  can  decompose  into  corresponding  blocks  as: 


C 


m— >-c 


(3.21) 


where  contains  the  sparming  tree  contributions  to  the  matrix  (which  appear  in 
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Figure  3.7:  Distinction  between  spanning  tree  baselines  (thick,  solid)  and  loop  entry  baselines  (thin,  dotted) 


multiple  closures),  and  is  the  ("2')  X  (”2')  identity  matrix  representing  the  loop-entry 
contributions  (each  of  which  appears  in  only  one  closure).  The  following  property  follows 
from  this  block  form  expression: 

Lemma  3.4.6.  The  elementary  divisors  ofCm^c  all  1. 

Proof:  Since  we  have  chosen  a  linearly-mdependent  subset  of  closure  relations,  r  = 
rank{Cm^c)  =  (”2^)-  There  exists  ar  x  r  minor  (namely,  I(n-i))  which  is  equal  to  1.  Therefore 
the  gcd  of  all  r  X  r  minors  is  1,  and  therefore  from  Theorem  3.4.3  (Elementary  Divisors),  all 
elementary  divisors  must  be  1.  □ 

Let  us  now  relate  Cm^c  to  the  matrix  Cg^c  used  by  Lannes  (2003).  Recall  from  the 
discussion  of  bispectra  in  Section  3.3  that  the  closure  relations  eliminate  piston  differences 
in  the  measurements  so  that  Cm^c  annihilates  the  subspace  L,  i.e.  the  space  spanned  by 
the  columns  of  M  corresponding  to  Defining  Mg  as  the  submatrix  of  M  containing  the 
columns  corresponding  to  9,  we  have 


e 

0 

Cm— >-ch4 

= 

Co— fc  0 

(p 

(p 

where  Cg^c  =  Cm^c^e-  Cg^c  is  an  (”2  ^  ^  matrix  which  is  rank-deficient  by  two 

Then  by  direct  analogy  to  Equation  (3.14),  we  can  obtain  valid  RSC  object  phase  solutions 
by  solving: 

^To  see  this,  note  that  each  solution  set  to  Equation  (4.49)  above  remains  valid  after  replacing  each  with 
Ofj  =  Oij  -  z  ■  (p  -  r,) 
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Cm^cM  ^  =[co->.  ol  ^  =y:i+27Tel,i  (3.23) 

cp  j  L  J  ^  ^ 

where  y*/  is  the  true  unwrapped  closure  vector  and  iTzef^^i  is  the  residual  integer 
wrapping  error  vector  after  applying  the  Bahai  NP  algorithm  to  solve  the  CVP  problem 
associated  with  matrix  and  Irced.  Note  that  if  we  find  a  vector  9*  satisfying 

Equation  (3.23),  it  will  clearly  also  satisfy  the  relation  Cg^cO*  =  y*/  +  Lannes 

(2003).  Note  furthermore  that  we  can  solve  the  equation  above  in  two  separate  integer¬ 
preserving  steps  of  the  form  of  Equation  (3.17),  the  first  involving  the  SNF  decomposition 
of  Cm^c,  and  the  second  involving  that  of  M.  Since  the  elementary  divisors  of  Cm^c  are  all 
1  by  construction  (by  Lemma  3.4.6)  and  hence  the  first  step  is  thus  integer-preserving,  wrap- 
invariance  again  amounts  to  whether  or  not  all  elementary  divisors  of  M  are  1.  Therefore 
we  have  the  following  Proposition  relating  wrap  invariance  for  closure  measurements  to 
that  for  raw  phase  measurements: 

Proposition  3.4.7.  (Sufficient  condition  for  wrap -invariance  of  closure-based  RSC):  If  the 

elementary  divisors  of  the  phase  measurement  matrix  M  are  all  1,  then  the  closure-based  RSC  solution 
will  be  wrap-invariant. 

□ 

We  remark  in  passing  that  although  the  preceding  analysis  was  presented  in  the  context 
of  the  traditional  three-aperture  closure,  it  applies  directly  to  the  case  of  closures  involving 
an  arbitrary  number  of  sides.  As  an  example,  consider  the  pattern  shown  in  Figure  3.8.  A 
spanning  tree  for  the  pattern  consisting  of  the  short  baselines  in  an  array  is  depicted.  Let 
{(psp}  denote  the  aperture  phase  differences  in  these  n  —  1  spanning  tree  baselines.  Note 
that  all  aperture  phase  differences  in  the  array  can  be  expressed  as  linear  combinations 
of  the  {(psp}-  If  the  aperture  phase  differences  are  known  reliably  via  measurements  of 
the  spanning  tree  baselines,  we  can  use  these  measurements  to  cancel  the  aperture  phase 
differences  in  all  other  measurements  (of  which  one  example  is  shown  in  green).  The 
idea  of  using  such  generalized  closure  phases  (Martinache,  2010)  is  indeed  the  mathematical 
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Figure  3.8:  Bootstrapping  phase  of  a  lozv-SNR  baseline  (green)  with  subset  (blue)  ofhigh-SNR  baselines  from 
spanning  tree  baselines  (black) 

foundation  for  the  promising  technique  known  as  baseline  bootstrapping  in  which  high-fidelity 
phase  measurements  of  several  high-SNR  baselines  (typically  the  short  baselines)  to  cancel 
the  atmosphere  on  each  lower-SNR  (and  hence  lower  fidelity)  baseline. 

Note  that  for  an  arbitrary  n-aperture  pattern,  there  will  in  general  be  n  —  1  spanning  tree 
baselines  and  thus  (2)  —  {n  —  1)  =  generalized  closures,  each  involving  a  distinct  clos¬ 
ing  (or  loop-entry)  baseline.  Therefore  the  resulting  measurement  matrix  can  be  expressed 
exactly  as  in  Equation  (3.21)  above  and  hence  the  preceding  analysis  holds. 

While  this  section  has  considered  a  few  possibilities  for  phase  observables,  relating 
mathematical  conditions  for  wrap  invariance  to  a  physical  condition  on  aperture  placement 
is  more  intuitive  when  considering  the  raw  phase  measurements  as  opposed  to  their  closures. 
For  this  reason,  for  the  remainder  of  the  chapter  we  will  work  directly  with  the  baseline 
phases  and  their  associated  wrapping  errors.  In  particular,  we  will  begin  by  connecting 
these  wrapping  errors  with  analogous  ambiguities  in  recently-developed  phasor-based 
approaches. 


3.5  Implications  of  Wrap  Ambiguities  on  Pattern  Design 

In  this  section  we  use  the  mathematically-sufficient  conditions  for  wrap  invariance  from 
the  previous  section  to  show  that  aperture  patterns  whose  interferometric  graph  satisfies  a 
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certain  loop-free  condition  are  wrap-invariant.  Here  we  define  the  interferometric  graph  in 
the  standard  way:  it  is  simply  the  graph  formed  by  connecting  the  array's  apertures  (the 
nodes  of  the  graph)  with  edges  representing  the  array's  baselines. 

This  condition  is  founded  on  the  Theorem  3.4.3  (Elementary  Divisors)  in  Section  3.4  and 
the  following  definition  of  the  matrix  determinant.  This  definition  is  given  in  many  linear 
algebra  texts  (see  e.g.  Bretscher  (2001)). 

Definition  3.5.1.  Matrix  Determinant  in  terms  of  patterns:  Suppose  we  have  an  n  x  n 
matrix  A.  Define  a  pattern  as  a  selection  of  n  entries  of  the  matrix  in  which  there  is  only  one 
chosen  entry  in  each  row  and  one  in  each  column  of  the  matrix.  Furthermore,  we  denote  a 
pair  of  numbers  in  a  pattern  as  inverted  if  one  of  them  is  located  above  and  to  the  right  of 
the  other.  Then  we  can  obtain  the  determinant  of  A  by  summing  the  products  associated 
with  all  patterns  with  an  even  number  of  inversions  and  subtracting  the  products  associated 
with  all  the  patterns  with  an  odd  number  of  inversions. 

Consider  ar  x  r  sub-matrix  M/  consisting  of  a  set  I  of  linearly-independent  rows  in  M 
and  d  +  N  —  3  linearly-independent  columns.  Our  goal  will  be  to  find  conditions  under 
which  such  a  sub-matrix  contains  only  one  pattern  with  a  non-zero  product,  in  which  case 
the  determinant  will  be  ±1  from  the  definition  above.  This  will  guarantee,  by  the  Theorem 
3.4.3  (Elementary  Divisors)  of  the  previous  section,  that  the  elementary  divisors  of  M  are  all 
1,  which  will  in  turn  ensure  that  the  error  in  the  RSC  solution  Trsc  will  be  0  mod  2n  by 
Corollary  3.4.5. 

Consider  one  such  Mj  and  note  that  within  the  fully-connected  interferometric  graph 
associated  with  M,  we  can  identify  a  sub-graph  G  containing  only  the  measurements  in  Mj. 
This  will  be  done  by  sequentially  identifying  those  matrix  entries  which  must  be  part  of  a 
pattern  with  a  non-zero  product.  Note  that  some  of  these  special  entries  from  the  matrix  Mj 
can  be  identified  immediately.  Namely,  all  non-redundant  measurements  contain  a  singleton 
±1  in  the  column  associated  with  their  object  phase.  All  non-zero  patterns  must  clearly 
contain  this  ±1  and  so  we  can  select  these  singleton  object-phase  entries  as  guaranteed 
participants  in  a  non-zero  pattern.  Moreover,  all  measurements  containing  a  leaf  node  (i.e.  a 
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node  with  a  single  connection)  in  G  contain  a  singleton  ±1  in  the  column  associated  with 
their  leaf  node.  All  non-zero  patterns  must  clearly  contain  this  ±1  as  well.  Thus  we  can  also 
select  these  leaf  node  entries  as  guaranteed  participants  in  a  non-zero  pattern. 

There  may  be  cascading  implications  of  such  singleton  measurements.  To  illustrate  this, 
consider  the  scenario  shown  in  Figure  3.9.  A  simple  RSC  array  is  shown  on  the  left.  A  subset 
of  the  baselines  in  one  possible  linearly-independent  sub-matrix  Mj  is  depicted.  Here  we 
intentionally  defer  selection  of  the  arbitrarily-set  phases  until  later  for  purposes  of  generality. 
A  simplified  depiction  of  the  matrix  Mj  is  shown  in  which  all  non-zero  entries  have  been 
colored  black  and  all  zero  entries  have  been  colored  white  for  simplicity.  In  Step  A  of  the 
reduction  process,  object  phase  65  is  selected  for  participation  (i.e.  its  matrix  entry  factored 
as  common  to  all  non-zero  patterns)  since  it  is  a  singleton  object  phase.  Its  corresponding 
row  (i.e.  row  5)  in  M/  is  then  eliminated  from  participation  since  the  remaining  entries  in 
this  row  cannot  participate  in  a  pattern  (by  definition  of  a  pattern).  In  Step  B,  the  aperture  6 
entry  (p(,  in  row  5  is  selected  since  it  has  become  a  leaf  node  in  the  pattern,  and  row  5  can 
then  be  eliminated.  Then  in  Step  C,  object  phase  64  is  then  selected  by  virtue  of  becoming  a 
singleton  object  phase,  and  row  4  is  then  eliminated.  This  selection/ elimination  process  can 
be  repeated  beyond  the  steps  shown  in  the  Figure,  until  either  no  leaf  nodes  and  singleton 
object  phases  remain,  or  there  are  no  baselines  left  to  eliminate.  We  formalize  the  pattern 
reduction  process  in  the  listing  Algorithm  2. 

We  can  see  that  any  baseline  in  an  interferometric  graph  which  does  not  belong  to  a  loop 
will  be  eliminated  in  the  reduction  process,  and  its  corresponding  matrix  entries  factored. 
The  only  structures  in  the  graph  that  persist  after  this  reduction  are  sets  of  loops  with  a 
certain  property.  Namely  we  define  a  persistent  loop  set  as  a  set  of  loops  that  contains  at  least 
two  instances  of  every  baseline  contained  in  the  set.  (A  set  can  consist  of  any  number  of 
loops,  including  one).  With  this  definition,  absolute  invariance  may  be  possible  if  the  graph 
of  the  redundant  baselines  does  not  contain  any  persistent  loop  sets.  Algorithm  2  returns 
PERSISTENT  if  persistent  loops  exist  and  LOOPEREE  if  the  pattern  is  completely  reduced 
and  therefore  free  of  persistent  loops. 
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Figure  3.9:  Example:  Reducing  an  aperture  pattern  and  associated  matrix  to  identify  Persistent  Loop(s) 


Algorithm  2  Pattern  Reduction  Algorithm 

Require:  R  {where  R  is  the  set  of  the  baselines  corresponding  to  Mj,  where  I  denotes  the 
indices  of  a  linearly-independent  subset  of  d  +  N  —  3  rows  of  M} 

while  \R  I  >  0  do 

1.  Leaf  Nodes 

1.1  remove  any  remaining  baselines  containing  leaf  nodes  from  R 

1.2  add  the  associated  apertures  to  the  list  N 

2.  Singleton  Object  Phases 

2.1  remove  any  remaining  baselines  containing  singleton  object  phases  from  R 

2.2  add  the  associated  object  phases  to  the  list  O 

if  no  baselines  removed  in  the  current  iteration  then 
return  PERSISTENT 
end  if 
end  while 
return  LOOPEREE 
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Note  that  in  the  latter  case,  we  will  have  eliminated  r  rows  from  Mj.  Since  each  baseline 
elimination  is  associated  with  object  phase  or  aperture  selection  from  distinct  columns,  and 
M/  contains  r  +  3  columns,  there  will  be  exactly  three  extraneous  columns  not  involved 
in  the  reduction  process.  The  r  x  r  submatrix  obtained  by  selecting  the  non-extraneous 
columns  (i.e.  those  corresponding  to  the  selected  object  phases  and  leaf  nodes  in  O  and 
N,  respectively)  will  then  have  a  determinant  of  ±1  by  virtue  of  having  a  single  non-zero 
pattern  revealed  by  the  reduction  process.  Having  ensured  the  existence  of  a  unit  r  x  r 
minor,  the  gcd  of  all  r  x  r  minors  must  be  1.  We  have  hence  confirmed  the  elementary 
divisors  must  be  all  1,  and  thereby  ensured  that  the  pattern  is  wrap-invariant  (c.f.  Theorem 
3.4.3  (Elementary  Divisors),  and  Corollary  3.4.5).  We  summarize  the  sufficient  condition  as 
follows: 

Proposition  3.5.1.  (Sufficient  conditions  on  aperture  pattern  for  wrap-invariance):  Consider 
the  graph  of  an  aperture  pattern  which  contains  d  distinct  baselines  and  any  set  ofN  —  3  linearly- 
independent  redundant  baselines.  If  this  graph  does  not  contain  persistent  loop  sets  (in  the  sense 
defined  above),  the  matrix  Mj  formed  by  these  independent  measurements  will  have  determinant  ±1. 
As  a  result  Corollary  3.4.5  will  hold,  thereby  guaranteeing  that  RSC  solution  trsc  invariant 

to  wrapping  of  the  phase  measurements. 

We  have  hence  arrived  at  a  a  physical  definition  of  a  wrap-invariant  pattern.  We  now 
apply  Algorithm  2  to  the  example  pattern  shown  in  Figure  3.2.  Algorithm  2  reduces  the 
pattern  to  the  persistent  loop  set  shown  in  Figure  3.10. 

The  elementary  divisors  of  the  pattern's  measurement  matrix  are  not  all  1;  they  are  all  1 
except  for  a  singleton  3  and  hence  detijili)  mod  3  =  0  for  all  choices  of  M;. 

Having  traced  the  distortion  induced  by  phase  wrapping  to  physical  property  of  the 
array  itself,  we  now  return  to  Figure  3.3.  The  lower  left  panel  shows  the  reconstruction  result 
with  the  SNF-based  Phase  method  described  in  Section  3.4.  The  closure  phase  approach 
yields  the  same  corruption  in  reconstruction,  as  the  elementary  divisors  of  Coc  are  also  all  1 
except  for  a  singleton  3. 

There  are  several  simple  ways  to  amend  this  pattern  so  that  it  is  wrap-invariant.  While 
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Persistent  Loop 


Figure  3.10:  Persistent  Loop  set  at  center  of  pattern  in  Figure  3.2 

the  most  intuitive  of  these  involve  moving  the  apertures  involved  in  the  persistent  loop 
shown  in  Figure  3.10,  these  approaches  leave  gaps  in  the  UV-sampling  pattern.  An  alternate 
approach  that  preserves  the  UV-sampling  is  to  add  an  aperture  to  the  center  of  the  pattern 
as  shown  in  Figure  3.11.  This  results  in  additional  linear ly-independent  redundancies 
colored  in  blue  and  green,  respectively,  in  the  Figure.  These  additions  replace  baselines  in 
the  persistent  loop,  allowing  this  loop  to  be  broken.  With  wrap-invariance,  reconstruction 
results  match  the  true  image  in  both  the  phase  and  phasor  approaches  as  respectively  shown 
in  Figure  3.12.  In  the  top  row,  reconstruction  results  are  displayed  for  the  phase  (left)  and 
phasor  (right)  approaches  for  the  noiseless  case.  The  image  distortion  present  in  Figure 
3.3  has  been  completely  eliminated  by  tweaking  the  pattern  so  that  it  is  wrap-invariant. 
Analogous  results  for  an  SNR  of  25  dB  are  displayed  in  the  bottom  row.  Flere  we  define 
SNR  as  the  ratio  of  the  phasor  magnitude  at  visibility  1  (i.e.  zero  spatial  frequency)  to  the 
standard  deviation  of  the  noise,  which  we  have  assumed  to  be  complex  Gaussian  and  i.i.d. 
across  spatial  frequency  for  this  simulation. 


3.6  Wrap-invariance  and  Practical  RSC  Calibration 

In  previous  sections  we  established  that  the  Phase  Approach  can  be  made  robust  to  phase¬ 
wrapping  using  the  Smith  Normal  Form  (SNF)  and  algorithms  from  lattice  theory.  Moreover, 
the  SNF  provided  a  mathematical  framework  for  the  notion  of  wrap-invariance.  From  a 
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Figure  3.11:  Amended  Pattern 
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Figure  3.12:  Reconstruction  Results  for  Amended  Pattern 
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practical  standpoint,  however,  computation  of  the  Smith  Normal  Form  is  likely  to  become 
a  computational  burden  for  large  arrays,  as  in  those  under  current  consideration  with 
Nap  ~  10^  and  n  ^  10^  baselines  (Zheng  et  al.,  2014).  Techniques  not  requiring  such  a 
computation  are  hence  of  strong  practical  interest.  In  this  section  we  show  that  the  wrap- 
invariance  property  checked  by  Algorithm  2  provides  a  certificate  for  reliable  reconstruction 
with  such  techniques,  in  the  presence  of  the  fundamental  27r-periodicity  of  interferometric 
measurements. 

3.6.1  Practical  Phase  Approaches 

Our  approach  here  will  rely  upon  the  well-known  Singular  Value  Decomposition  (SVD)  of 
the  measurement  matrix  M,  which  is  given  by: 

M  =  (3.24) 

in  which  XJg-  and  V^-  are  m  x  m  and  {d  +  N)  x  {d  +  N)  orthogonal  matrices,  respectively. 
Ecr  is  a  m  X  (d  -|-  N)  diagonal  matrix  with  r  non-zero  diagonal  entries  (the  so-called  singular 
values  o/M),  where  r  =  rank{M)  =  d  +  N  —  3. 

Lemma  3.6.1.  The  final  3  columns  of  V  a  form  a  basis  for  the  nullspace  o/M. 

Proof:  This  follows  from  the  fact  M  is  rank-deficient  by  3,  and  standard  properties  of  the 
right  singular  vectors  comprising  Yg-  in  the  SVD.  (Bretscher,  2001)  □ 

Now  recall  that  in  Section  3.3,  we  provided  an  one  particular,  SNF-based,  solution  to 
Equation  (3.14).  Here  we  instead  consider  the  complete  set  of  solutions  to  this  Equation, 
which  is  given  by: 


=  M+  +  iTzel)  +  To  (3.25) 

where  Tq  is  any  vector  in  the  nullspace  of  M,  and  M+  denotes  the  pseudo-inverse  of  M, 
whose  matrix  elements  are  derived  directly  from  the  SVD  above. 
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(3.26) 


M+  =  V,L+Uj 

where  Ejr  is  a  {d  +  N)  x  m  diagonal  matrix  whose  r  non-zero  diagonal  entries  are  the 
reciprocals  of  the  corresponding  non-zero  entries  in  "Lg-. 

Typically  Phase  Approach  techniques  implicitly  select  a  particular  solution  from  the  fam¬ 
ily  of  solutions  in  Equaton  (4.61)  via  augmenting  the  matrix  M  with  additional  constraints. 
The  most  common  among  these  (Wieringa,  1992)  (Wijnholds  and  Noorishad,  2012)  is  to 
enforce  that:  E  (^  =  0,  ^  ^/r,  =  0,  where  r,  is  the  vector  position  of  the  z-th  aperture  in  the 
array. 

Note  that  the  error  resulting  from  application  of  this  pseudo-inverse  to  the  unwrapped 
measurement  vector  will  be  given  by: 

2716^  =  M+ (2716^)  (3.27) 

Recall  that  the  error  vector  eg-  has  two  parts:  one  for  the  error  in  the  atmosphere/ piston 
(which  we  will  denote  with  the  index  L),  and  one  for  the  error  in  the  Fourier  phases  (which 
we  denote  with  the  index  K),  i.e.: 


eg  =  [eg^ueg^K]'^  (3.28) 

We  will  focus  attention  on  the  latter,  since  it  is  of  direct  relevance  for  image  formation. 
Let  us  express  the  spatial  frequencies  measured  by  an  array  as  two-element  vectors  of  the 
form  {cOx,  (X}y).  Let  X  be  the  dxl  matrix  containing  these  spatial  frequencies.  Note  then  that 
the  phase-wrap  error  will  manifest  itself  merely  as  an  image  shift  if  and  only  if  this  error  is 
a  (modulo-27r)  phase  ramp,  i.e.  there  exists  a  2-element  shift  vector  z  and  an  integer  vector 
k  which  together  satisfy: 


2Tieg,K  -  2nXz  =  27rk  (3.29) 

Substituting  from  Equation  (C.40)  we  obtain: 
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M+  {2ne)  -  IttXz  =  27rk  (3.30) 

where  denotes  the  sub-matrix  of  M+  formed  by  the  rows  associated  with  K. 

Dividing  through  hy  2tt  we  obtain  the  equation:  —  Xz  =  k.  Note  that  each 

element  of  can  be  expressed  as  some  rational  number  Similarly  we  first  assume 
X  contains  rational  spatial  frequencies  with  greatest  common  denominator  qx-  Then  we 
can  multiply  through  by  the  least-common-multiple  (LCM)  of  the  {qi}  and  qx  to  obtain  a 
system  of  equations  whose  coefficients  are  guaranteed  to  be  integer  (i.e.,  we  have  a  linear 
Diophantine  system).  Let  this  LCM  be  denoted  as  /.  Then  we  have,  after  rearranging  terms, 

lXz  =  l{M^el-k)  (3.31) 

We  now  wish  to  determine  conditions  under  which  there  exist  vectors  k  and  z  satisfying 
this  overdetermined  Diophantine  system.  Applying  the  Smith  Normal  Form  decomposition 
(c.f.  Theorem  3.4.2)  to  the  matrix  IX  this  time,  and  noting  that  rank(X)  =  2,  we  have: 


Dx  =  Ux(/X)Vx  (3.32) 

where  Ux  and  Vx  are  unimodular  matrices  of  size  m  x  m  and  2x2,  respectively,  and 
Dx  is  a  rectangular  diagonal  matrix  whose  entries  are  zero  below  row  2. 

If  we  left-multiply  Equation  (C.43)  by  Ux  on  both  sides,  we  obtain: 

/UxXz  =  /Ux(M+e;;-k)  (3.33) 

Using  Equation  (C.44)  and  the  fact  that  Vx  is  a  unimodular  (and  hence  invertible)  matrix, 
we  can  then  write: 


DxV-iz  =  /(UxM+e^-Uxk) 


(3.34) 


We  are  now  in  position  to  prove  the  main  result  of  this  section,  which  is  preceded  by  the 
following  Eemma: 
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Lemma  3.6.2.  Given  wrap-invariance,  the  vector  UxM^e  has  integer  entries  below  row  2. 
Proof:  (see  Appendix  B) 


□ 


Proposition  3.6.3.  If  a  pattern  is  wrap-invariant  (in  the  sense  of  Section  3.4),  reconstruction  error 
induced  by  phase  wrapping  is  limited  to  an  image  shift. 

Proof:  We  re-arrange  the  Equation  (C.46)  above  so  that  it  reads: 

yDxVx^z  -  UxM+e^  =  -Uxk  (3.35) 

Let  V  =  jDxVx^z  -  UxM+e^.  Note  that  since  Dx  is  zero  below  row  2,  the  entries  of  v 
below  row  2  will  be  equal  to  those  of  (— UxM^e^),  which  are  integers  by  Lemma  3.6.2.  Now 
consider  the  first  and  second  entries  of  v.  Let  f  be  the  vector  containing  the  fractional  parts 
of  the  first  two  elements  of  vector  UxM^e)^,  and  let  A  be  the  invertible  matrix  consisting 
of  the  first  two  rows  of  yDxVj^^.  Without  loss  of  generality,  choose  z*  =  A^^f  so  that  the 
fractional  part  f  is  annihilated,  leaving  only  integer  elements  in  the  first  two  entries  of  v. 
Hence  we  now  have: 


v  =  -Uxk  (3.36) 

with  v  ensured  to  contain  only  integer  elements.  Since  Ux  is  unimodular,  the  vector 
k*  =  — U^^v  will  be  integral.  We  have  thus  found  a  pair  (z*,k*)  with  integer  k*  which 
satisfies  the  Equation  (C.46).  Since  Equation  (C.46)  is  related  to  Equation  (C.43)  via  a 
unimodular  (and  hence  invertible)  mapping  Ux,  invariance  is  hence  proven.  □ 

With  the  previous  result,  we  have  characterized  the  complete  family  of  Phase  Approach 
solutions  given  in  Equation  (4.61).  Namely  we  have  shown  that,  for  a  wrap-invariant  pattern, 
the  family  differs  by  at  most  an  image  shift  from  the  true  solution.  Returning  to  our  running 
example  in  Eigure  3.11,  we  verified  that  different  solution  choices  from  the  family  given  in 
Equation  (4.61)  simply  resulted  in  shifts  of  an  otherwise  pristine  image  in  the  reconstruction. 
On  the  other  hand  with  wrap-variant  pattern  in  Eigure  3.2,  image  distortion  of  the  severity 
of  Eigure  3.3  was  again  observed,  as  expected. 


63 


3.6.2  Practical  Phasor  Approaches 

Though  traditional  treatments  employ  the  phase  approach  of  the  previous  section  which 
operates  on  baseline  phases,  recent  papers  (e.g.  Marthi  and  Chengalur  (2014),  Liu  et  al. 
(2010))  have  shown  that  approaches  which  operate  at  the  phasor  level  can  be  superior 
in  accuracy.  Liu  et  al.  (2010)  developed  a  Gauss-Newton-type  Non-linear  Least-Squares 
(NLS)  solver  and  showed  it  produced  unbiased  phase  estimates,  in  contrast  with  the 
biased  ones  provided  by  the  phase  approach.  Marthi  and  Chengalur  (2014)  and  Wijnholds 
and  Noorishad  (2012)  have  also  proposed  low-complexity  phasor-based  approaches  and 
demonstrated  performance  near  the  Cramer-Rao  Bound.  Though  the  capacity  of  the  Phasor 
approaches  to  produce  superior  accuracy  relative  to  Phase  approach  has  been  demonstrated, 
the  former's  convergence  issues  can  be  mitigated  via  initialization  with  the  results  of  the 
latter  (see  e.g.  Liu  et  al.  (2010),  and  Zheng  et  al.  (2014)). 

The  implementations  of  the  Phasor  Approach  typically  employ  the  following  measure¬ 
ment  model: 


=  +  (3.37) 

where  1/^  is  the  complex  visibility  observed  between  apertures  i  and  j,  gi  =  \gi\et‘t’’  and 
gj  =  \gj\e^‘^>  are  the  complex  gains  of  these  apertures,  fjj  is  the  true  complex  visibility 
measured  by  this  pair,  and  n  is  complex  measurement  noise.  Note  that  the  phase  difference 
between  gi  and  gj  is  simply  the  optical  path  difference  between  apertures  i  and  j  introduced 
in  the  previous  section.  Given  this  model,  NLS  approaches  attempt  to  find  a  set  of  complex 
phasors  {g,}  and  {/}  which  minimize  an  objective  function  of  the  form: 

^  (3.38) 

i  i>i 

Minimization  of  A  with  respect  to  the  unknowns  (i.e.  distinct  object  and  antenna 
complex  gains)  can  be  accomplished  with  iterative  application  of  the  following  updates, 
as  reported  by  Marthi  and  Chengalur  (2014)  in  the  context  of  radio  interferometry,  and  by 
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Lacour  et  al.  (2007)  in  the  context  of  optical  interferometry: 


^  Lj>kgkgj^kj 

Lj>kZOkj\gkW 

where  the  {/;,}  are  the  true  complex  visibilities  of  the  distinct  object  phases  in  the  array 


(3.39) 


(3.40) 


Due  to  the  circularity  of  these  definitions,  these  equations  must  be  solved  iteratively. 
Starting  from  an  initial  guess  for  all  phasors.  Equation  (3.39)  is  solved  to  obtain  a  better 
estimate  for  the  {gk}  and  then  these  {gk}  are  used  to  obtain  refined  estimates  of  the  {/;,} 
through  Equation  (3.40).  In  the  next  iteration,  these  {/f,}  are  used  to  further  refine  {gk},  and 


Though  there  are  other  means  of  minimizing  objectives  of  the  form  A  (Wijnholds  and 
Noorishad,  2012)  (Liu  et  al.,  2010),  we  omit  discussion  of  them  here;  our  present  purpose 
is  to  characterize  the  correctness  of  the  solutions  themselves,  regardless  of  how  they  are 
obtained.  As  has  been  noted  before  (see  e.g.  Lannes  and  Anterrieu  (1999)),  there  are  strong 
cormections  between  phase-  and  phasor-based  approaches.  To  see  this,  let  z  be  the  vector  of 
products  {gig}f\i-j\}  which  minimize  A.  We  rewrite  Equation  (3.38)  as: 


^  =  LL 1 1  (^7  -  ) (^7  -  4') 1 1  (3-41) 

i  j>i 

Define  r,y  =  for  an  arbitrary  integer  and  r  as  the  vector  containing  the  r,y 

Note  that  r  also  minimizes  A  since  the  rotations  do  not  change  the  values  of 

the  residuals  in  A.  Hence  any  set  of  rotated  phasors  {g,}  and  {f\i-j\}  whose  products 
produce  the  vector  r  will  also  minimize  A.  Note  that  the  set  of  such  valid  phase  vectors 
(i.e.  the  concatenations  of  possible  {z^gi}  and  {Z/|,_y|})  includes  the  complete  family  of 
Phase-approach  solutions  p,  in  Section  3.6.1  with  =  Zr  (where  Zr  is  the  vector  of  the 
phases  of  the  complex  vector  r).  In  other  words,  the  valid  phase  component  of  the  phasor 
approach  solutions  is  not  unique,  and  the  minimization  of  A  admits  the  same  solution 
ambiguity  depicted  in  Eigure  3.6.  Hence  we  see  that  integer  ambiguities  present  in  the 
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phase  approach  do  not  disappear  in  the  phasor  approach;  in  fact,  the  unwrapped  candidate 
solutions  of  the  phase-based  approach  correspond  to  minima  of  the  phasor-based  objective. 
In  practice  Phasor  Approach  techniques  may  converge  to  any  one  of  these  minima.  Hence 
the  critical  issue  for  reliable  image  reconstruction  is  again  the  nature  of  the  difference 
between  these  valid  minima  and  the  true  solutions.  Based  on  the  connections  we  have 
drawn  with  Phase-approach  solutions  above,  the  following  Proposition  is  clear: 

Proposition  3.6.4.  If  a  pattern  is  wrap-invariant,  the  global  minima  of  the  Phasor- Approach  objective 
caused  by  the  inherent  2n-periodicity  in  the  objective's  residual  differ  from  the  true  solution  merely 
by  an  image  shift. 

In  practice  we  see  that,  as  in  the  Phase  approach,  if  the  pattern  is  not  wrap-invariant, 
the  Phasor  Approach  suffers  from  false  global  minima  producing  severe  distortion  in 
the  resulting  reconstruction.  The  lower  right  panel  of  Figure  3.3  shows  the  resulting 
reconstruction  produced  by  the  Phasor  method  using  the  updates  in  Equations  (3.39)  and 
(3.40).  To  show  the  correspondence  in  solutions  between  two  types  of  approaches,  we 
provided  the  result  of  the  Phase  approach  as  the  initial  point  for  the  Phasor  approach 
as  is  common  in  practice  (Liu  et  ah,  2010),  (Zheng  et  ah,  2014).  Indeed  this  point  is  a 
global  minimum  of  A  which  the  updates  above  cannot  escape,  and  as  a  result  we  observe 
virtually-identical  distortion  to  that  of  the  Phase  approach. 

We  repeated  the  experiment  with  the  wrap-invariant  pattern  in  Figure  3.11,  and  as 
expected,  the  results  were  pristine  in  the  noiseless  case  and  virtually-identical  to  those  of 
the  Phase  approach  in  the  noisy  case.  For  completeness  the  results  are  given  in  Figure  3.13. 

As  expected,  different  initializations  of  the  updates  in  Equations  (3.39)  and  (3.40)  simply 
resulted  in  shifts  of  an  otherwise  pristine  image  in  the  reconstruction. 

3.7  Conclusions 

In  this  chapter,  we  have  examined  the  ambiguities  caused  by  the  27r-periodicity  of  inter¬ 
ferometric  phase  in  Redundant  Spacing  Calibration.  In  particular  we  have  described  their 
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Figure  3.13:  Reconstruction  Results  for  Phasor  Approach 


Phasor  Approach  (Noiseless)  Phasor  Approach  (SNR  =  25  dB) 


fundamental  presence  in  existing  RSC  methods  whether  the  observables  considered  are  the 
measured  baseline  phasors  or  their  phases.  In  the  former,  e.g.  by  Greenaway  (1990),  they 
are  manifested  as  phase-wrapping  errors,  and  in  the  latter,  e.g.  by  Marthi  and  Chengalur 
(2014),  as  false  minima.  We  have  demonstrated  that  in  either  case,  these  ambiguities  can 
result  in  noticeable  distortion  of  the  reconstructed  image. 

Using  the  Closest- Vector-Problem  formulation  of  the  unwrapping  problem  due  to  Lannes 
and  Anterrieu  (1999),  we  have  developed  the  notion  of  a  wrap-invariant  pattern.  For  wrap- 
invariant  patterns,  the  impact  of  the  27r-periodicity  can  be  completely  eliminated  (c.f.  Section 
3.4)  using  well-known  algorithms  from  lattice  theory  and  the  Smith  Normal  Form,  and 
reduced  to  a  mere  image  shift  (c.f.  Section  3.6)  when  existing,  fast  approaches  are  used. 
Phase-approach  solutions  (Arnot  et  al,  1985),  (Greenaway,  1990),  (Wieringa,  1992),  (Lannes 
and  Anterrieu,  1999))  are  commonly  used  to  quickly  obtain  an  initial  point  to  aid  in  the 
convergence  of  the  Phasor  approach.  They  are  obtained  by  selecting  specific  solutions 
from  the  general  family  in  Equation  (4.61)  via  enforcement  of  additional  constraints  on 
the  solution.  Phasor  approaches  seek  those  complex  gains  and  object  visibilities  which 
minimize  a  squared-residual  with  respect  to  the  observed  complex  visibilities  (Wijnholds 
and  Noorishad,  2012),  (Liu  et  al,  2010),  (Marthi  and  Chengalur,  2014)  using,  for  example. 
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gradient-descent  methods.  We  have  seen  that  the  same  27r-ambiguity  which  creates  a  family 
of  Phase-approach  solutions  also  produces  a  corresponding  family  of  global  minima  in  the 
Phasor  approach.  For  either  case,  our  results  show  that  for  wrap-invariant  patterns,  this 
family  represents  merely  shifted  versions  of  the  true  image.  We  have  also  extended  this 
analysis  to  show  that  wrap-invariant  patterns  admit  reliable  imaging  using  standard,  and 
generalized,  phase  closures.  Conversely  we  show  that  patterns  which  are  not  wrap-invariant 
can  suffer  from  distortion  of  the  sort  depicted  in  Figure  3.3. 

The  prognosis  for  mitigation  of  the  ambiguity  issues  raised  in  this  chapter  is  quite 
positive.  Random  patterns  appear  to  satisfy  the  wrap-invariance  condition  with  high 
probability.  Moreover,  failure  to  meet  this  condition  amounts  to  the  existence  of  a  particular 
kind  of  cycle  in  the  interferometric  graph  which  can  be  easily  isolated;  a  chief  contribution 
of  this  chapter  is  a  simple  algorithm  for  identifying  such  cycles  so  that  they  can  be  removed 
by  the  array  designer.  Finally,  we  have  shown  an  example  execution  of  the  algorithm  to 
diagnose  a  member  of  a  popular  pattern  which  is  not  wrap-invariant.  It  is  clear  that  with 
careful  array  design,  both  Phase-  and  Phasor-based  RSC  techniques  can  reliably  produce 
quality  image  reconstructions  free  from  discernible  artifacts. 
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Chapter  4 


Robust  image  reconstruction  with 
redundant  arrays  and  generalized 
closure  phases 

4.1  Chapter  Overview 

Redundant  Spacing  Calibration  (RSC)  techniques  employ  redundancy  in  the  baselines  of  a 
telescope  array  to  eliminate  the  contribution  of  atmospheric  turbulence  in  the  interferometric 
observables.  Whereas  conventional  techniques  for  this  elimination  require  the  enforcement 
of  prior  constraints  on  the  underlying  image,  RSC  algorithms  are  mathematically  well-posed 
in  nature  and  hence  require  no  such  prior  knowledge.  Traditionally  these  algorithms  have 
been  applied  directly  to  the  fringe  measurements.  However,  in  scenarios  of  low  photon 
flux,  such  as  those  arising  in  the  observation  of  dim  objects  in  space,  single-exposure 
fringe  measurements  are  not  reliable  observables  in  general.  Instead  one  must  rely  on 
time-averaged,  atmosphere-invariant  quantities  such  as  the  bispectrum.  In  this  chapter, 
we  develop  a  novel  RSC-based  algorithm  which  provides  robust  image  reconstruction 
using  integrable  atmosphere-invariant  observables.  Our  algorithm  utilizes  standard  linear 
estimation  methods,  as  well  as  techniques  from  lattice  theory  to  reliably  estimate  the 
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Fourier  phase.  Moreover,  we  provide  theoretical  and  empirical  evidence  that  generalizing 
the  classical  bispectrum  to  higher-order  atmosphere-invariant  observables,  which  we  call 
n-spectra,  can  offer  significant  performance  gains.  Our  selection  of  an  independent  and 
high-SNR  set  of  n-spectra  leverages  the  notion  of  the  minimum  cycle  basis  from  graph  theory. 
We  analyze  the  expected  shot-noise-limited  performance  of  our  algorithm  for  both  pairwise 
and  Fizeau  interferometric  architectures,  and  corroborate  this  analysis  with  simulation 
results  showing  performance  near  the  Cramer-Rao  bound.  Lastly,  we  apply  techniques  from 
the  field  of  compressed  sensing  to  perform  image  reconstruction  from  the  estimated  complex 
visibilities. 

4.2  Problem  Statement  and  Prior  Work 

Recall  from  previous  Chapters  that  interferometric  measurements  take  the  form: 


j6  =  0  +  A^  (4.1) 

where  A  is  an  (^)  x  N  matrix  whose  rows  compute  the  piston  differences  involved  in  each 
measurement. 

In  Chapter  2  we  introduced  a  class  of  algorithms  which  decouple  the  Fourier  phases 
9  from  the  atmosphere  contributions  ^  by  imposing  prior  constraints.  With  estimates  for 
sparse  samples  of  the  Fourier  Transform,  we  showed  that  the  residual  ill-posed  inverse 
problem  could  be  regularized  effectively  using  techniques  from  compressed  sensing.  As 
interferometric  systems  are  tasked  with  the  observation  of  increasingly-complex  scenes, 
approaches  which  minimize  reliance  on  specialized  prior  assumptions  are  expected  to 
gain  appeal.  In  Chapter  3  we  described  the  family  of  RSC  techniques  as  an  intrinsically 
well-posed  alternative  to  prior-regularized  phase  recovery  is  to  use  baseline  redundancy 
to  explicitly  solve  for  piston  variation.  We  also  examined  the  important  issue  of  integer 
phase  ambiguities  in  the  context  of  these  techniques.  Array  design  considerations  based  on 
the  notion  of  wrap-invariance  (Kurien  et  ah,  2016),  as  well  as  algorithms  from  lattice  theory 
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(Lannes  and  Anterrieu,  1999),  can  be  used  to  solve  these  ambiguity  issues. 

In  this  chapter  we  leverage  elements  from  both  Chapters  2  and  3  to  develop  novel 
RSC  algorithms  for  low-flux  imaging  scenarios  (as  in,  for  example,  observation  of  dim 
objects  in  space).  Here  we  must  again  rely  on  atmosphere-invariant  observables  like  the 
bispectrum  and  closure  phase.  Many  approaches  have  been  developed  for  estimation  of  the 
Fourier  phase  from  the  closure  phase  relations.  Several  early  papers  on  the  subject  (Rogstad, 
1968),(Bartelt  et  al,  1984)  proposed  solving  these  equations  recursively.  Namely,  two  initial 
Fourier  phases  0a  and  9b  associated  with  two  connected  baselines  A  and  B  can  be  initialized 
arbitrarily  without  loss  of  generality  (see  Section  3.4.1).  The  Fourier  phase  9c  of  the  residual 
baseline  C  closing  A  and  B  can  be  determined  as:  9c  =  Pabc  —  ^a  —  where  ^abc  is  the 
closure  phase  associated  with  triangle  ABC.  These  initial  phase  estimates  can  then  bootstrap 
solution  for  the  remaining  phases  using  the  appropriate  closure  relations.  While  effective 
at  high  SNR,  such  approaches  are  clearly  sub-optimal  in  terms  of  estimation  accuracy;  the 
fact  that  information  about  a  given  Fourier  phase  is  contained  in  multiple  closure  phases 
implies  there  is  a  benefit  in  joint  inference  of  the  Fourier  phases  from  the  complete  set  of 
closure  phases. 

As  we  will  quantify  later  in  this  Section,  closure  phases  are  corrupted  with  a  variable 
amount  of  measurement  noise;  closure  phases  involving  short-baseline  measurements  will 
typically  incur  less  distortion  due  their  high  visibility  (and  hence  high-SNR)  relative  to  their 
long-baseline  counterparts.  Reliable  joint  inference  from  the  closure  phases  then  entails 
minimization  of  a  weighted  sum  of  closure  or  bispectrum  residuals  (i.e.  weighted  least- 
squares  methods).  Such  methods  are  of  course  complicated  by  wrapping  issues  analogous  to 
those  described  in  Chapter  3.  To  address  these,  Haniff  (1991)  developed  an  algorithm  which 
respects  the  27r-periodicity  of  the  closure  phase.  This  approach  uses  a  conjugate-gradient 
routine  find  the  set  of  Fourier  phases  {0}  which  minimize  the  following  non-linear  objective: 


=  mod  27T{^i,cl  -  {^ii  +  di2  +  Oa)} 


i=l 


Wi 


(4.2) 


where  Nc  is  the  number  of  closures,  ^i^ci  is  the  z-th  closure  phase,  the  0,i,  0,2/  and  9q  are  the 
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Fourier  phases  associated  with  this  closure,  and  Wi  is  a  weighting  factor  proportional  to  the 
estimated  variance  of  the  closure  measurement. 

While  this  minimization  has  been  shown  to  be  successful  in  certain  cases,  it  is  prone  to 
stagnation  at  local  minima  due  to  the  non-smoothness  of  the  mod  (x)  function  (Negrete- 
Regagnon,  1996),(Thiebaut  and  Giovarmelli,  2010).  An  alternative  least-squares  approach 
suggested  by  Gorham  et  al.  (1989)  which  instead  works  with  the  closure  phasor  (i.e.  the 
normalized  bispectrum)  has  hence  generally  been  preferred  (Negrete-Regagnon,  1996).  This 
approach  replaces  the  objective  above  with  one  taking  the  form: 

.  2 

'*'2  =  E - —2 -  w-s) 

1=1 

Objective  Y2  is  clearly  a  continuous  function  and  hence  local  minima  can  be  reliably  found 
with  gradient-descent  techniques.  However  the  objective  is  non-convex  in  general,  and  hence 
gradient-based  techniques  are  not  guaranteed  to  converge  to  the  true  global  minimum. 

To  the  best  our  knowledge,  work  by  Lannes  and  Anterrieu  (1999)  was  the  first  to  explore 
the  use  of  closure  phases  within  the  RSC  framework.  In  this  work,  the  authors  develop  a 
pseudo-inverse  estimator  which  recovers  the  Fourier  phases  from  the  measured  baseline 
phases  via  an  intermediate  computation  of  the  closure  phases.  The  computation  of  the 
necessary  covariance  matrices  for  optimal  estimation  within  this  formulation  is  omitted. 
Since  it  is  in  fact  the  closure  phases  which  are  the  direct  observables  in  practical  low-flux 
scenarios,  we  present  instead  an  approach  working  solely  with  the  closure  phases  and 
compute  the  necessary  covariances  in  straightforward  manner.  Lannes  (2003)  later  proposed 
an  alternate  estimator  based  on  computation  of  the  Smith  Normal  Form  of  the  matrix 
mapping  Fourier  phases  to  closure  phases.  This  estimator  has  the  advantage  of  completely 
eliminating  the  effect  of  phase  wrapping  in  the  closure  measurements  when  appropriate 
routines  from  lattice  theory  are  used  to  pre-process  the  phase  measurements.  In  contrast, 
our  estimator  obviates  computation  of  the  Smith  Normal  Form  at  the  possible  expense  of 
an  extra  shift  in  the  recovered  image.  Such  image  shifts  are  anticipated  to  be  of  negligible 
importance  in  most  cases  of  practical  interest. 
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Though  the  bispectrum  and  associated  closure  phase  have  been  standard  interferometric 
observables  for  decades,  it  is  not  difficult  to  imagine  situations  in  which  use  of  these 
observables  imnecessarily  limits  reconstruction  performance.  In  Figure  4.1,  we  illustrate  this 
idea  with  an  example.  We  label  each  baseline  of  an  interferometric  array  with  a  visibility, 
which  is  an  indicator  of  the  strength  of  its  Fourier  component  relative  to  the  mean  intensity 
of  the  image.  In  accordance  with  the  power-law  decay  of  intensity  with  spatial-frequency 
modulus  in  an  overwhelmingly-large  fraction  of  natural  images  (see,  e.g.  Ruderman  (1994)), 
we  consider  a  visibility  distribution  which  drops  sharply  with  baseline  length.  In  standard 
bi-spectral  imaging,  the  high-spatial-frequency  phase  information  associated  with  long 
baseline  b*  is  recovered  via  forming  closures,  for  example,  with  a  short  baseline  along 
with  another  long  baseline  (Traditional).  For  a  certain  range  of  photon  fluxes,  the  SNR  of 
the  bispectrum  is  approximately  proportional  to  the  sum  of  the  reciprocals  of  the  squared 
visibilities  of  the  associated  baselines  (Kulkarni  et  ah,  1991),  i.e.: 


SNR«f,(tT 

\i=l  li  , 


-1 


(4.4) 


where  n  is  the  number  of  photoelectrons  (pe)  received  per  interference  fringe  per  exposure 
frame.  As  we  will  show  in  this  chapter,  this  SNR  model  extends  in  the  natural  way  to 
the  SNR  of  higher-order  observables  like  that  shown  on  the  right  (Generalized).  In  contrast 
with  the  Traditional  observable.  Generalized  observable  utilizes  only  short  (high-visibility) 
baselines  to  close  the  long  baseline,  resulting  in  a  near-doubling  of  the  resulting  SNR.  We 
will  henceforth  refer  to  the  higher-order  generalization  of  the  bispectrum  triple  product  as 
the  n-spectrum,  and  its  phase  as  the  generalized  closure  phase. 

Generalization  of  the  closure  phase  is  not  a  novel  concept  in  the  astronomical  community. 
Recently  Martinache  (2010)  proposed  observables  known  as  Kernel  phases,  which  are  obtained 
by  computing  an  orthogonal  basis  for  the  left-nullspace  (e.g.  using  the  Singular  Value 
Decomposition)  of  the  matrix  A  in  Equation  (4.1)  .  The  vectors  of  this  basis  are  assembled  as 
the  rows  of  a  matrix  K.  When  this  matrix  is  applied  to  the  measurements  /3,  the  contributions 
due  to  optical  path  differences  (OPD)  clearly  cancel,  leaving  only  linear  combinations  of 
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Figure  4.1:  Generalizing  the  phase  closure  concept .  SNRs  given  assume  each  aperture  contributes  h  =  2e3 
photoelectrons  to  each  fringe  in  pairwise  combination. 


the  object  phases  in  9.  These  linear  combinations  can  then  be  used  to  reconstruct  the 
Fourier  phase  via  direct  application  of  the  pseudo-inverse  of  K  (Martinache,  2014),  or  via 
application  of  prior  image  models  (Ireland,  2013).  These  studies  consider  masked-aperture  or 
segmented-pupil  scenarios  in  which  either  closure  or  Kernel  phases  can  be  obtained  directly 
via  linear  combinations  of  the  raw  phase  measurements.  Since  we  consider  the  scenario 
encountered  in  long-baseline  interferometry,  in  which  accurate  raw  phase  measurements  are 
not  reliably  obtained  via  multi-frame  averaging,  our  inference  problem  must  be  formulated 
in  terms  of  integrated  phasor  observables,  i.e.  the  n-spectra  defined  above. 

Our  approach  is  also  conceptually  similar  to  the  approach  of  baseline  bootstrapping 
introduced  in  Chapter  3,  Section  3.4.3.  In  this  approach  first  proposed  by  Roddier  (1988), 
temporal  OPD  changes  along  a  long,  low-SNR  baseline  b  are  inferred  by  combining  (or 
bootstrapping)  those  measured  on  several  short  high-SNR  baselines  along  a  closed  path 
involving  b.  In  principle,  the  phases  on  all  baselines  can  then  be  de-rotated  accordingly 
and  the  corresponding  phasors  integrated  coherently,  thereby  boosting  the  SNR  on  all 
baselines.  This  tracking  and  de-rotation  can  be  done  real-time  with  fringe  tracking  or  offline 
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on  processed  data  (Buscher,  2015).  As  in  bootstrapping,  our  approach  seeks  to  build  reliable 
observables  by  forming  closed  paths  involving  high-SNR  measurements.  However,  whereas 
bootstrapping  eliminates  differential  phase  relative  to  a  reference  exposure  frame  and  hence 
implicitly  requires  a  threshold  per-frame  SNR  on  the  shortest  baselines  for  robustness,  our 
approach  has  no  such  threshold  as  it  works  directly  on  atmosphere-invariant  observables. 

In  this  chapter,  we  develop  a  comprehensive  algorithmic  framework  for  image  recon¬ 
struction  based  on  RSC  techniques  and  compressed  sensing,  and  analyze  its  performance 
for  the  two  popular  beam-combining  architectures  used  in  optical  interferometry  {pairwise, 
and  Fizeau).  The  chapter  is  organized  as  follows.  In  Section  II,  we  derive  an  SNR  model  for 
the  n-spectrum,  as  well  as  for  the  covariance  amongst  distinct  n-spectra.  The  analysis  begins 
with  the  pairwise  case  for  simplicity,  and  mathematical  arguments  for  an  extension  to  the 
Fizeau  case  are  presented  at  the  end  of  the  section  and  in  the  Appendix.  In  Section  III,  we 
describe  a  systematic,  integer-least-squares  approach  for  reconstructing  imagery  from  object 
visibilities  and  generalized  closure  phases  (GC).  We  leverage  the  notion  of  minimum  cycle 
basis  to  select  a  set  of  GC's  of  minimum  variance  which  spans  the  subspace  of  all  closures. 
Leveraging  techniques  first  suggested  by  Lannes  and  Anterrieu  (1999)  and  the  notion  of 
wrap-invariant  array  design  introduced  in  Kurien  et  al.  (2016),  we  apply  algorithms  from 
lattice  theory  to  unwrap  the  generalized  closures.  Here  we  quantify  the  performance  gain 
afforded  by  generalizing  the  methods  of  Lannes  and  Anterrieu  (1999)  to  higher-order  cycles. 
We  also  demonstrate  close  agreement  of  the  simulation  results  with  the  theoretical  perfor¬ 
mance  developed  in  the  chapter.  We  then  apply  our  algorithm  to  an  example  reconstruction 
of  a  dim,  structured  object  from  these  generalized  closures.  Here  we  employ  a  compressed 
sensing  algorithm  which  enforces  smoothness  in  the  image  domain  by  minimizing  a  total 
variation  metric,  to  successfully  reconstruct  the  target. 

4.3  Preliminaries 

In  this  chapter  we  analyze  the  performance  of  generalized-closure-based  imaging  for  both 
the  pairwise  and  Fizeau  interferometric  architectures  (see  Section  1.2)  in  terms  of  mean- 
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squared  error  (MSE)  of  the  phase  estimates.  We  begin  with  the  pairwise  architecture  since 
it  is  significantly  easier  to  analyze.  We  then  present  a  series  of  approximations  which 
accurately  describe  the  algorithm's  performance  in  the  Fizeau  case  while  keeping  the 
complexity  of  the  mathematics  at  a  manageable  level.  In  both  cases,  the  analysis  is  based 
on  the  computation  of  the  moments  of  the  Poisson  distribution;  in  the  pairwise  case,  the 
first  two  moments  are  required,  whereas  in  the  Fizeau  case,  the  first  o  moments  are  needed, 
where  o  is  the  order  of  the  generalized  closure. 

4.3.1  Fringe  Noise  Model  for  Pairwise  Beam  Combiner 

Given  Nap  apertures,  recall  that  the  Fourier  phase  and  amplitude  of  the  object  is  encoded  in 
a  series  of  (^'’)  interference  fringes  observed  on  a  focal  plane.  In  the  pairwise  case,  each 
fringe  is  measured  on  a  separate  set  of  detectors  so  that  the  Poisson  shot  noise  incurred 
is  independent  for  each  fringe.  Suppose  each  aperture  receives  n  photons  and  that  this 
light  is  split  evenly  (e.g.  by  beam-splitter)  before  being  combined  with  the  light  from  the 
other  apertures.  Hence  each  aperture  contributes  n  =  photons  to  each  fringe.  The 

expectation  of  number  of  photons  q  at  a  given  detector  k  for  a  given  fringe  is  given  by: 

('iW)  =  ^(l  +  7cos(a;/c  +  0  +  <^))  (4.5) 

where  Np  is  the  number  of  pixels  over  which  these  photons  are  spread,  7  is  the  visibility  of 
the  fringe  (which  takes  it  value  between  0  and  1),  co  is  the  fringe  frequency,  0  is  the  Fourier 
phase,  and  (p  is  the  atmosphere  phase. 

To  retrieve  the  object  phase  and  amplitude  from  its  fringe  encoding,  we  simply  take  the 
Fourier  transform  and  evaluate  it  at  the  fringe  frequency: 

Np-l 

z  =  X:  (4.6) 

k=0 

The  expectation  of  this  quantity  is  given  by: 

(z)  =  (4.7) 


76 


where  f^f  is  the  phase  of  the  fringe. 

To  obtain  the  bispectrum  SNR,  it  will  be  useful  to  compute  the  power  in  the  fringe,  i.e. 
expectation  of  the  quantity  zz* : 


{zz*)  =  (  E  E 

\  Jc=0  )c'=0 


This  is  equivalent  to: 


{zz*)  =  E  E 

k'=0 


Using  the  first  two  moments  of  the  Poisson  distribution,  we  can  write: 


(4.8) 


(4.9) 


{q{k)q{k'))  =  {q{k)){q{k'))  +Skk'{q{k)) 


(4.10) 


Substituting  into  Equation  (4.9),  we  obtain: 


Np-l 


Np-1 


Np-1 


(22*)  =  E  E  +  E  kk)) 

k=0  k>=0  k=0 


Simplifying  we  obtain: 


(zz*)  =  +  2n 


(4.11) 


(4.12) 


4.3.2  N-Spectrum  Covariance  and  SNR  Models  for  the  Pairwise  Architecture 

Let  us  define  a  generalized  N-spectrum  G  as  the  product  of  complex  baseline  measure¬ 
ments  along  an  o-sided  loop.  Given  variances  ^i(G)  and  (G)  of  real  and  imaginary 
components,  respectively,  we  define  the  pseudo-variance  of  the  product  as: 


VpairzuiseiG)  :=  =  (GG*)  -  (G)(G*) 


(4.13) 


or: 


i^vairwiseiG)  — 


i=l 


1=1 


(4.14) 


i^pairwise{G)  — 


n(«^7f +  2n) 


!  =  1 


1=1 


(4.15) 
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In  the  specific  case  of  the  classic  triple  product  (i.e.  the  bispectrum),  this  expands  to 


(Kulkarni  et  al,  1991): 


VpairmiseiG)  =  2n^  {^hl  +  TiTs  +  ihi)  +  4n^(7i  +  72  +  73)  + 


(4.16) 


The  SNR  S pairwise  ■= 


is  then  given  by: 


7i7273n2 


^pairwise 


(4.17) 


^  +  Thi  +  737?)  +  2n(7f  +  72  +  72)  +  4 

To  gain  insight  into  the  limiting  behavior  of  the  SNR,  we  make  two  approximations: 
A  low-SNR  approximation  can  be  found  by  considering  only  the  constant  term  in  the 
denominator,  which  will  dominate  when  n  and/ or  the  7,-  are  small. 


5 


pairwise, low 


7i7273n2 


(4.18) 


A  high-SNR  approximation  can  be  found  by  considering  the  highest-order  term  in  the 
denominator,  which  will  dominate  when  n  is  large  and/or  the  7,  are  large. 


5 


pairwise, high 


7l7273n2 


niiii  +  ibi  +  ibl) 


(4.19) 


By  averaging  the  fringe  phasors  over  a  sufficient  number  of  frames  Nf,  we  can  build 
the  SNR  to  a  level  at  which  the  bispectrum  phase  (i.e.  the  closure)  phase  can  be  measured. 
For  phasor  SNRs  sufficiently  greater  than  1,  we  can  apply  the  following  well-known 
approximation  for  the  variance  on  the  corresponding  phase  6^/  (see  e.g.  Walkup  and 
Goodman  (1973)): 


1 


N/S2 


(4.20) 


We  then  obtain  the  following  approximations  for  the  closure  phase  variances.  At  low 
SNR,  we  have: 


’'6^1, low 


(4.21) 
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Taking  the  logarithm  we  find  that  the  variance  decouples: 


log  ~  -2  log  71-2  log  72-2  log  73  -  3  log  n  +  C  (4.22) 


where  C  is  a  constant  independent  of  the  flux  level  ft  and  the  visibilities  7,. 
At  high  SNR,  we  have: 


^Bchhigh 


11  1  1 
nNf  7f  7^  jI 


Generalizing  the  results  above  to  n-spectra,  we  obtain  for  low  SNR, 


(4.23) 


Nfn^UUlt 


(4.24) 


and  for  high  SNR, 


2  ^  ^ 

%,,high  ~ 

where  0  is  the  order  of  the  generalized  closure. 
Equation  (4.26),  we  obtain: 


0  I 

Ei 

!=1  h 


(4.25) 

Note  that  if  we  take  the  logarithm  of 


2  ° 

log  ~  0  log  V  -  2  X]  log  7/  -  log  N/  (4.26) 

^  i=l 

4.3.3  Covariance  Matrix  of  the  Generalized  Closure  Phases  for  the  Pairwise 
Architecture 

In  this  section  we  generalize  results  of  Kulkarni  et  al.  (1991)  to  obtain  expressions  for 
the  n-spectra  covariance  matrices.  We  will  first  compute  the  covariance  between  two  n- 
spectrum  phasors  Gi  =  Re[Gi]  +jIm[G-i],  with  E[Gi]  =  gi,  and  Gi  =  Re[G2]  +  ilm[G2\, 
and  £[G2]  =  g2-  Let  I  and  /  denote  the  set  of  baselines  in  n-spectra  Gi  and  G2,  respectively. 
Then  we  have: 


c^(Gi,G|) 


iei  jej  /  iei  jej 


(4.27) 
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We  can  then  factor  out  the  factors  common  to  Gi  and  G2  from  those  which  are  distinct 
to  obtain: 

(7(Gi,G|)=  Yl  n  i^k){zl)  Ylizi)  n  (4)  (4-28) 

jceinj  keinj  }  iei\J  me]\i 

Making  the  appropriate  substitutions  from  Equation  (4.12),  we  obtain: 

t7(Gi,G|)  Yl  (n^7^  +  -  Yl 

_keir\}  keinj 

^  Yl^li  Yl  ”4'’"  (4.29) 

lei\J  me]\i 

where  ^zi;t/Z2*  =  1  if  the  common  measurements  are  the  same,  and  zero  if  they  are  conjugates 
of  each  other.  Similarly, 

t7(Gi,G2)  n  +  2n4.*,z2,J  -  Yl  ^^Yk 

_keir\}  keinj 

X  n  n 

lei\J  me]\i 

where  Szj^,^,z2j,  =  1  if  the  common  measurements  are  conjugates  of  each  other,  and  zero  if 
they  are  the  same. 

We  can  now  compute  real  and  imaginary  n-spectra  covariance  as: 

a{Re[Gi],Re[G2])  =  ^Re[Cor?(Gi,  G|)  +  Coz?(Gi,  G2)]  (4.31) 

(7{Im[G^],Re[G2\)  =  ^l?w[Cor’(Gi,  G|)  +  Cor’(Gi,  G2)]  (4.32) 

cr(Re[Gi],l?w[G2])  =  ^lm[-Coz?(Gi,  G^)  +  Coi^(Gi,  G2)]  (4.33) 


80 


(4.34) 


cr{Re[GilIm[G2])  =  ^Re[-Cov{Gi,G*2)  -  Cov{Gi,G2)] 

Given  the  n-spectra  covariance  expression  above,  we  can  now  approximate  the  co- 
variance  between  two  generalized  closure  phases  using  Taylor  series  approximation  of 
the  moments  (also  known  as  the  delta  method).  First  we  define  the  angle  function  of  a 
phasor  as  0{x,y)  =  arctan  The  corresponding  generalized  closure  phases  are  given  by: 
©1  =  0(Jm[Gi],Re[Gi])  and  ©2  =  0(Jm[G2],Re[G2]).  Applying  the  delta  method  yields  the 
following  approximation: 


^(0.-02)  «  (||  J  (||^)  ^(R4C,],In,lG,])  (4,35) 

Evaluation  of  the  derivatives  in  the  expression  above  yields: 


4.3.4  Covariance  Matrix  Approximations  for  the  Fizeau  Architecture 

In  the  Fizeau  case.  Equation  (4.12)  becomes: 
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as: 


(zz*)  =  +  Napn  (4.40) 

As  shown  in  Appendix  C.l,  we  can  approximate  the  variance  of  a  n-spectrum  of  order  o 


^Fizeau  (G) 

Hence  the  SNR  can  be  approximated  as: 


Ylin^lf  +  Napn) 


-n^Tlyf 

i=l 


(4.41) 


5, 


Fizeau 


VlmUUli 


(4.42) 


[nLi  (n^lj  +  Napn)] 

Applying  the  same  approximations  in  Appendix  C.l  to  the  covariance  yields  the  follow¬ 
ing  analogues  to  Equations  (4.29)-(4.30) 


c7(Gi,G|)  =  e27ry(ZGi-ZG2) 


X 


n 

keinj 


n 

keinj 


n  n 

lei\}  mej\i 


(4.43) 


and. 


(t{Gi,G2)  =  e2^K^Gi+ZG2) 


X 


n  +  N„pn4i,„z2,J  -  n  ^^^k 

keinj  keinj 


X  n  n 

lei\J  mej\i 


(4.44) 


4.4  Fourier  Phase  Recovery  using  the  N-Spectra 

In  this  section,  we  briefly  review  the  integer  least-squares  approach  for  RSC  phase  recovery 
(see  e.g.  Lannes  and  Anterrieu  (1999),  Kurien  et  al.  (2016)  for  more  comprehensive  treat¬ 
ments),  and  then  describe  an  algorithm  for  selection  of  a  set  of  generalized  closure  relations 
of  minimum  total  variance. 
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4.4.1  RSC  Phase  Recovery  with  Wrap-Invariant  Patterns 

We  begin  by  recalling  the  fundamental  RSC  equation  from  Chapter  3: 


M  =  +  2716  (4.45) 

n 

In  this  Section  we  discuss  generalized  closure  relations  which  eliminate  the  ^  contribution 
in  the  model  in  Equation  (4.45).  We  first  recall  from  Chapter  3  (see  Definition  3.4.1)  that 
following  the  analysis  of  Larmes  and  Anterrieu  (1999),  the  range  of  the  matrix  M  can  be 
divided  into  two  subspaces:  (1)  the  subspace  K  spanned  by  the  d  columns  associated  with 
the  Fourier  phases,  or  equivalently  the  range  of  the  sub-matrix  Mg  formed  by  these  columns, 
and  (2)  the  piston,  or  aberration,  phase  space  L  is  the  span  of  the  Nap  columns  associated 
with  the  piston  phases,  or  equivalently  the  range  of  the  sub-matrix  formed  by  these 
columns.  Note  that  the  d  columns  spanning  K  are  linearly-independent  by  virtue  of  having 
non-zero  entries  in  mutually  disjoint  sets  of  baseline  indices.  Hence  dim{K)  =  d.  The 
subspace  L  has  dimension  Nap  —  1  (Larmes  and  Anterrieu,  1999);  the  constant  vector  forms 
the  one-dimensional  nullspace  of  M^. 

Additionally,  let  us  establish  the  following  Definitions: 

Definition  4.4.1.  (Interferometric  graph):  The  interferometric  graph  of  an  array  is  the 
directed  graph  whose  vertices  are  the  Nap  apertures  in  the  array  and  edges  are  the 
baselines  cormecting  all  vertex  pairs. 

Definition  4.4.2.  (Cycle-space  of  a  directed  graph)  (Liebchen  and  Rizzi,  2005):  Given  a 
directed  graph  G  with  a  set  of  vertices  V  and  edges  E,  the  cycle  space  of  the  graph  is 
the  vector  space  of  sparmed  by  the  incidence  vectors  of  cycles  with  a  (clockwise,  or 
counter-clockwise)  orientation.  It  can  be  shown  that  the  dimension  of  the  cycle  space  is 
M  -|-  N  —  1,  where  M  =  |£|  and  N  =  |y|. 

Suppose  we  have  a  basis  for  the  cycle-space  of  the  interferometric  graph.  Let  us  stack  the 
elements  of  this  basis  as  row  vectors  in  a  ^)  X  (^"P)  matrix  Cmc,  which  maps  baseline 
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Figure  4.2:  One  possible  cycle-basis  for  a  simple  interferometric  graph.  Note  the  fourth  triangle  (i.e.  baseline 
set  {2,4,6})  can  be  expressed  as  a  linear  combination  of  the  cycles  shown. 


phase  measurements  to  generalized  closure  phases.  Recall  that  closure  relations  eliminate 
piston  differences  in  the  measurements  so  that  Cmc  annihilates  the  subspace  L,  i.e.  the  space 
spanned  by  the  columns  of  M  corresponding  to 

In  Figure  4.2,  we  show  a  simple  four-aperture  interferometric  graph  and  one  possible 
cycle  basis  for  the  graph.  For  this  particular  example,  we  can  write: 

(4.46) 

where  the  column  (i.e.  baseline)  indexing  in  the  matrix  follows  the  labeling  in  the  Figure. 
Note  that  the  final  triangle  cycle  (i.e.  the  one  containing  baselines  2, 4,  and  6,  and  represented 
by  the  row  vector  w*  :=  (0, 1,0, 1,0,  —1)),  can  be  represented  as  a  linear  combination  of  the 
cycles  in  the  basis:  w*  =  ci  -|-  C2  —  C3,  where  {C;}  are  the  rows  of  Cmc 

While  for  purposes  of  simplicity  we  have  chosen  a  cycle  basis  consisting  exclusively  of 
three-baseline  cycles,  in  general  the  elements  of  a  cycle  basis  can  contain  any  number  of 
edges. 

Lemma  4.4.1.  The  nullspace  of  Cmc  is  L. 

Proof:  Clearly  the  Nap  columns  of  L  are  in  the  nullspace  of  Cmc  from  the  arguments  above. 
To  see  that  these  columns  span  the  nullspace,  note  that  from  the  well-known  Rank-nullity 
theorem  from  linear  algebra  (see,  e.g.  Bretscher  (2001)),  we  have: 


Cmc  — 


11  1  0  0  0 

0  0-1110 
10  0  oil 
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dim{ker{Cnic))  = 
Hence  dim{ker{Cmc))  =  dim{L). 


ap 


^)=Nap-l 


(4.47) 

□ 


9 

r 

e 

CmcM 

= 

Coc  0 

(4.48) 


where  Coc  ■=  Cmc^e,  arid  is  the  mapping  between  object  phases  and  closure  phases. 


Proposition  4.4.2.  Let  Ar^  and  Ary  denote  the  vectors  containing  the  x-  and  y-coordinates  of  the 
baselines  in  an  array,  respectively.  If  the  array  is  a  valid  RSC  array,  these  columns  form  a  basis  for 
the  two-dimensional  nullspace  of  Coc- 


Corollary  4.4.3.  For  a  valid  RSC  array,  the  mapping  Coc  Is  injective  up  to  an  image  shift. 

Proofs  of  Proposition  4.4.2  and  Corollary  4.4.3  are  given  in  Appendix  C.2.  □ 

Let  us  define  jd  as  the  observation  vector  of  wrapped  generalized  closure  phases,  and 
2neci  the  wrapping  vector  (with  integer  e^/).  Our  phase  measurement  model  is  then  given 
by: 


CocO  =  ye  +  n  (4.49) 

where  ye  =  jd  +  2ned,  and  n  is  the  measurement  noise,  which  we  shall  assume  in  this 
chapter  is  predominantly  due  to  shot  noise  on  the  focal  plane. 

Given  that  the  generalized-closure  mapping  is  injective  up  to  an  image  shift,  we  can 
now  formulate  the  well-posed  recovery  of  the  Fourier  phases  as  the  following  generalized, 
integer  least-squares  problem,  which  finds  a  vector  in  the  range  of  the  closure  matrix  at  a 
minimum  Mahalanobis  distance  from  the  actual,  wrapped  measurement  vector: 

Orsc  =  ~  (je  -  CocO^  (4.50) 

where  £  is  an  estimator  of  the  covariance  matrix  of  the  phase  measurements.  This  estimator 
can  be  obtained  by  substituting  estimates  for  the  object  visibilities  {7;}  into  the  covariance 
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expressions  in  the  previous  section.  It  is  acknowledged  that  this  least-squares  solution  is 
neither  the  optimal  nor  maximum-likelihood  solution  to  the  phase  inference  problem.  For 
one,  a  Gaussian  distribution  on  the  closure  phases  is  implicitly  assumed  to  hold,  and  can 
only  approximately  hold  for  high-SNR  closure  phases.  It  is  therefore  assumed  that  n-spectra 
will  be  integrated  for  a  sufficient  number  of  frames  for  this  approximation  to  be  reliable  for 
most  of  the  closure  phases. 

Assuming  the  covariance  matrix  estimator  is  not  degenerate,  it  will  admit  a  Cholesky- 
decomposition  L  =  BB^.  Equation  (4.50)  is  then  equivalent  to  searching  for  vectors  e  which 
minimize  the  projection  of  a  whitened  measurement  B“^ye  onto  the  subspace  defined  by 
ker{{'B^^Coc)^)-  Specifically  we  seek  to  minimize: 

/(e)  =  ||PzB-i(y,/  +  27re,,)||'  (4.51) 

where  is  a  matrix  representing  the  orthogonal  projection  from  K”  onto  ker{{B^^Coc)'^)- 

Letting  e'^  =  —ed,  we  can  rewrite  the  above  objective  function  as: 

/(e')  =  ||PzB-i(y,,  -27re',)||"  =  \  (4.52) 

This  optimization  problem  is  equivalent  to  the  so-called  closest  vector  problem  (CVP)  in 
the  theory  of  lattices.  The  connection  of  interferometric  phase-unwrapping  with  the  CVP 
problem  was,  to  the  best  of  our  knowledge,  first  explored  in  Lannes  and  Anterrieu  (1999).  In 
this  paper,  the  authors  formulate  an  analogous  minimization  problem  for  the  raw  baseline 
measurements. 

We  will  define  a  lattice  L  (Z'^ )  as  the  set  of  points  generated  by  integer  combinations 
of  the  column  vectors  of  a  matrix  L.  Letting  P  =  P^L"^,  our  optimization  problem  then 
amounts  to  the  following:  Find  the  lattice  point  in  P(Z”)  which  is  closest  to  Pyc/.  A  compact 
representation  of  the  lattice  F  is  given  by: 

{m<n—{d+N—3)  'j 

UiVi  I  Vfl/  G  Z I  (4.53) 

where  {v,}  are  linearly-independent  and  together  form  a  basis  of  the  lattice.  Given  the 
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lattice  basis,  several  algorithms  exist  for  finding  the  closest  lattice  point  to  a  specified  vector. 
A  popular  class  of  algorithms,  known  as  the  Sphere-Decoding  algorithms,  are  efficient 
searches  for  the  closest  lattice  point  within  a  hypersphere  of  a  certain  radius  centered  on  the 
input  vector  (see  e.g.  Agrell  et  al.  (2002)).  For  the  simulations  in  this  chapter,  we  instead  use 
the  lower-complexity  Bahai  Nearest  Plane  (Babai-NP)  algorithm  (Bahai,  1986).  For  lattice 
bases  which  are  nearly  orthogonal  (such  as  those  we  use  for  our  simulations),  this  algorithm 
offers  reliable,  albeit  not  guaranteed,  performance  in  practice. 

Suppose  we  have  foimd  a  basis  for  the  lattice  P(Z”),  and  we  have  solved  the  Closest 
Vector  Problem  for  a  given  closure  vector  y^/.  Let  b*  be  the  output  of  the  Bahai  Nearest 
Plane  Algorithm  -  i.e.  it  is  the  lattice  point  which  is  the  closest  to  y^/  ^  We  now  seek  to  solve 
for  the  wrap  vector  corresponding  to  this  lattice  point,  i.e.  we  seek  a  solution  to: 

b*  =  Pe  (4.54) 

Note  that  P  is  a  (weighted)  projection  matrix  and  thus  not  full-rank,  and  therefore 
there  will  be  mfinitely-many  solutions  to  this  equation.  Indeed  our  solution  is  correct 
only  to  within  an  integer  vector  eyesid  iri  the  range  of  Coc-  Flence  there  is  a  fundamental 
ambiguity  that  needs  to  be  addressed.  Once  the  measurement  vector  is  unwrapped,  there 
are  various  ways  to  solve  the  RSC  system  which  respect  the  presence  of  a  residual  ambiguity 
eygsid  described  above.  Following  standard  least-squares  principles,  we  first  compute  via 
projection  the  vector  y*^  in  im{Coc)  closest  (in  Mahalanobis  distance)  to  the  imwrapped 
measurement  vector,  i.e. 


y.V(c)  =  ®‘‘(i-'’j:)»yd  («5) 

Per  the  first  method  (Lannes,  2003),  we  can  compute  the  Smith  Normal  Form  (SNF)  (see 
Theorems  3.4.2  and  3.4.3)  {U,  D,V}  of  our  matrix  Coc,  and  set  Uc  =  =  D,  and 

^It  is  well-known  that  the  performance  of  the  Bahai  algorithm  and  other  CVP  algorithms  is  improved  when 
the  lattice  basis  is  reduced  as  mentioned  in  Section  3.2.1.  We  used  the  LLL  algorithm  implementation  due  to 
Zhou  (2014)  in  our  simulations. 
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Vc  -  v-1. 


Coc  =  UcDcVc  (4.56) 

A  valid  RSC  solution  can  then  be  obtained  by  evaluating: 

Srsc  =  Vc‘D+Uc‘yA/„(c)  (4-57) 

where  D^t  denotes  the  pseudo-inverse  of  D. 

The  effect  of  the  residual  ambiguity  vector  6^51^  on  this  solution  requires  careful  consid¬ 
eration.  Note  that  the  resulting  wrap-induced  error  in  this  solution  is  given  by: 

IneRsc  =  (4.58) 

Since  the  matrices  Uc  and  are  unimodular,  they  are  invertible  over  the  integers.  If 
all  the  elementary  divisors  in  Dc  are  equal  to  1,  then  will  also  be  integral,  and  hence 
esse  will  be  integral.  This  in  turn  guarantees  that  the  final  error  in  the  Fourier  phase  will 
be  0  mod  2ti,  i.e.  that  the  RSC  solution  9rsc  is  wrap-invariant.  In  Chapter  3  we  developed 
sufficient  conditions  under  which  an  interferometric  array  is  wrap-invariant  in  this  sense.  We 
review  the  main  result  below  for  completeness. 

Definition  4.4.3.  (Persistent  loop  set):  A  set  of  cycles  in  the  interferometric  graph  which 
contains  at  least  two  instances  of  every  baseline  contained  in  the  set.  (A  set  can  consist  of 
any  number  of  loops,  including  one.) 

Definition  4.4.4.  (Sufficient  condition  for  Wrap-invariant  pattern):  Consider  the  graph  of 
an  RSC  aperture  pattern  which  contains  d  distinct  baselines  and  any  set  of  N  —  3  linearly- 
independent  redundant  baselines.  If  this  graph  does  not  contain  any  persistent  loop  sets, 
the  aperture  pattern  is  wrap-invariant. 

Definition  4.4.5.  (Integer  Cycle  Basis):  An  integer  cycle  basis  is  a  cycle  basis  in  which  every 
cycle  in  the  basis  can  be  expressed  as  an  integer  combination  of  other  cycles  in  the  basis. 
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Equivalently,  the  elementary  divisors  of  the  corresponding  cycle  matrix  Cmc  are  all  1. 
(Liebchen,  2003)  ^ 

Proposition  4.4.4.  (Sufficient  condition  on  an  aperture  pattern  for  wrap-invariant  closure 
imaging):  If  the  interferometric  pattern  is  wrap-invariant,  the  RSC  phase  solutions  derived  from 
(generalized)  closures  in  an  integer  cycle  basis  are  wrap-invariant. 

For  the  latter  condition,  we  see  from  Proposition  4.4.4  that  an  integer  cycle  basis  can  be 
constructed  as  the  generalized  closure  set  formed  by  closing  the  edges  of  any  spanning  tree 
in  the  interferometric  graph  (Kurien  et  al,  2016).  In  fact  we  have  observed  that  in  practice  it 
is  satisfied  with  high  probability  for  randomly-chosen  linearly-independent  sets  of  closures, 
of  which  the  minimum-basis  cycles  discussed  in  the  next  section  form  a  subset.  However, 
the  former  condition  can  be  violated  by  certain  highly-symmetric  patterns  (e.g.  see  Figure 
3.2). 

As  it  turns  out,  for  wrap-invariant  patterns,  the  inverse  problem  in  Equation  (4.49)  can  be 
solved  reliably  even  without  the  Smith  Normal  Form;  it  is  sufficient  to  find  any  unimodular 
r-by-r  sub-matrix  C  of  Coc  and  solve  the  associated  smaller  system  in  Equation  (4.59)  to 
obtain  a  valid  solution.  Recall  that  for  valid  RSC  arrays  Coc  will  have  r  -|-  2  columns  and 
hence  in  this  approach,  two  (non-collinear)  object  phases  are  implicitly  set  to  zero.  This 
implicit  selection  then  fixes  the  fundamentally-ambiguous  translation  of  the  scene  discussed 
at  the  begirming  of  the  section. 


0  =  C-V:/..(C)  (4-59) 

Since  C  is  unimodular,  C^^  will  have  solely  integral  entries  so  that  the  resulting  wrap 
error  C^^eresid  =  0  mod  2tz.  Assuming  the  measured  phase  has  been  correctly  unwrapped, 
the  solution  in  Equation  (4.59)  amounts  to  a  so-called  basic  solution  of  our  generalized 
least-squares  problem.  We  then  have  the  following  expression  for  the  error  covariance 
matrix  for  the  estimator  9rsc  (Kay,  1993). 

^Integral  cycle  bases  also  find  application  in  the  field  of  cyclic  timetabling.  The  interested  reader  is  directed 
to  Liebchen  (2003). 
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^basic  =  (C^E-lC)-l 


(4.60) 


The  basic  solutions  belong  to  the  countably-infinite  set  of  solutions  to  Equation  (4.49).  It 
turns  out,  somewhat  surprisingly,  that  any  particular  solution  within  this  set  can  be  reliably 
limited  to  a  mere  image  shift  for  patterns  that  are  wrap-invariant.  The  complete  set  is  given 
by: 


d-  =  Cocy:iMC)  +  ^0  (4.61) 

where  'o  is  any  vector  in  the  nullspace  of  Coc,  and  C+  denotes  the  pseudo-inverse  of  Coc- 
The  pseudo-inverse  can  be  computed  using  the  Singular- Value-Decomposition  (SVD)  of  Coc, 
which  is  given  by: 


Coc  =  (4.62) 

where  U^t-  and  V^-  are  x  and  d  X  d  orthogonal  matrices,  respectively.  Ep-  is  a 

(^“P^i)  X  d  diagonal  matrix  with  r  non-zero  diagonal  entries  (the  so-called  singular  values  of 
Coc  ),  where  r  =  rank{Coc)  =  d  —  2. 

The  Moore-Penrose  pseudo-inverse  is  then  given  by  (Bretscher,  2001): 

C+  =  U,E+Vj  (4.63) 

where  E+  is  a  diagonal  matrix  whose  r  non-zero  diagonal  entries  are  the  reciprocals  of  the 
corresponding  non-zero  entries  in 

In  Chapter  3,  we  showed  that  the  effect  of  wrapping  on  the  RSC  pseudo-inverse  solution 
to  Equation  (4.45)  can  be  reliably  limited  to  an  image  shift  for  wrap-invariant  patterns.  An 
analogous  result  and  proof  apply  to  closure-based  RSC  imaging: 

Proposition  4.4.5.  For  wrap-invariant  patterns,  the  error  induced  by  wrapping  of  the  (generalized) 
closure  phases  can  be  limited  to  an  image  shift. 

Proof:  See  Appendix  C.3  □ 
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We  can  leverage  this  result  to  establish  an  analogous  result  pertaining  to  the  well-known 
non-linear  least-squares  formulation  due  to  Gorham  et  al.  (1989)  discussed  in  Section  4.2. 
Namely  we  have  the  following  Corollary: 

Corollary  4.4.6.  For  wrap-invariant  patterns,  the  set  of  solutions  minimizing  Objective  Y2  in 
Equation  4.3  differ  from  the  true  solution  by  an  image  shift  in  the  noiseless  case. 

Proof:  For  Objective  Y2  to  be  zero,  clearly  each  term  (i.e.  each  squared-residual)  in  the 
summation  must  be  zero.  Hence  we  must  have  |  yi.  These  constraints 

are  clearly  equivalent  to  the  linear  phase  constraints  solved  by  the  family  of  solutions  in 
Equation  (4.61).  Hence  the  solution  sets  are  the  same  in  the  noiseless  case.  □ 

The  covariance  of  elements  in  the  pseudo-inverse  solution  can  be  expressed  as  (Mont¬ 
gomery  et  al,  2006): 


(4.64) 

where  Ya-  is  obtained  by  omitting  the  final  2  columns  of  V,,  which  form  a  basis  for  the 
nullspace  of  Coc- 

4.4.2  Selection  of  the  N-Spectra 

In  this  section,  we  describe  a  strategy  for  obtaining  a  near-optimal  linearly-independent  set 
of  generalized  closure  phases.  This  strategy  is  founded  on  the  notion  of  minimum  cycle  basis 
from  graph  theory. 

Definition  4.4.6.  (Minimum  cycle  basis  of  a  directed  graph):  Let  each  edge  of  the  graph 
be  assigned  a  positive  weight  Wj^,  and  the  weight  of  a  cycle  be  defined  as  the  sum  of  the 
weights  of  its  constituent  cycles. 

With  these  Definitions,  we  can  then  ask  for  the  set  of  linearly-independent  oriented 
cycles  which  spans  the  cycle  space  and  has  minimum  total  weight.  We  call  such  a  set  the 
minimum  cycle  basis  of  a  directed  graph  G. 
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We  next  state  a  fundamental  lemma  that  establishes  a  practical  method  for  obtaining  a 
minimum  cycle  basis,  for  which  the  proof  can  be  found  in  many  textbooks  on  graph  theory 
(see  e.g.  Gross  and  Yellen  (2006)). 

Lemma  4.4.7.  (Optimality  of  the  Greedy  Algorithm):  Consider  a  vector  space  V,  a  set  of  vectors 
which  span  V,  and  a  corresponding  set  of  weights  for  each  of  these  vectors.  A  basis  of  minimum  total 
weight  can  be  found  by  first  sorting  the  vectors  in  decreasing  order  of  weight,  and  then  selecting 
vectors  for  the  basis  in  this  order  if  and  only  if  they  are  linearly-independent  of  previous  selections. 
This  is  known  as  the  greedy  algorithm. 

From  Lemma  4.4.7,  we  can  find  a  set  of  linearly-independent  closure  relations  of  minimum 
total  variance  by  employing  the  greedy  algorithm  on  the  set  S  of  all  possible  closure  relations. 
However,  this  set  is  equivalent  to  the  set  of  all  possible  cyclic  permutations,  which  becomes 
enormous  for  arrays  of  large  size.  Fortunately,  in  an  extension  of  a  result  due  to  Horton 
(1987)  for  undirected  graphs,  Liebchen  and  Rizzi  (2005)  showed  that  the  elements  of  the 
minimum  cycle  basis  of  a  directed  graph  could  be  found  among  a  special  (and  much  smaller) 
subset  of  S  known  as  the  Horton  cycles.  We  describe  these  cycles  through  the  following 
definitions: 

Definition  4.4.7.  (Shortest  path  tree):  The  shortest  path  tree  of  a  graph  G  with  respect  to  a 
node  A  is  the  spanning  tree  which  connects  node  A  to  each  other  node  in  G  via  a  path  of 
minimum  weight. 

Figure  4.3  gives  a  simple  example  of  a  shortest  path  tree  for  an  interferometric  graph  in 
which  the  top-most  node  is  the  root.  The  construction  of  shortest  path  trees  is  a  well-studied 
problem  in  graph  theory  (see  e.g.  Gross  and  Yellen  (2006))  which  is  typically  solved  using  a 
shortest-path  routine  such  as  Dijkstra's  Algorithm  (Dijkstra,  1959). 

Definition  4.4.8.  (Horton  cycle):  Given  the  shortest  path  tree  T  originating  from  an  arbitrary 
node  A,  a  Horton  cycle  is  a  cycle  formed  by  connecting  the  endpoints  of  any  two  branches 
of  T. 
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Figure  4.3:  A  shortest  path  tree 


-  -  -  Shortest-Path  Tree 

Lemma  4.4.8.  (Minimum-cycle-basis  elements  are  Horton  cycles):  All  elements  of  the  minimum- 
cycle-basis  are  Horton  cycles.  Therefore  in  searching  for  the  minimum-cycle-basis,  it  suffices  to  search 
the  Horton  cycles.  Proof:  We  summarize  the  Proof  due  to  Liebchen  and  Rizzi  (2005)  in  Appendix 
C.4. 


To  exploit  the  Lemma  above,  we  seek  a  decomposition  of  the  weight  of  a  cycle  into  the 
weights  of  its  constituent  edges  (i.e.  baselines).  As  the  following  Propositions  show,  at 
the  extreme  low-  and  high-SNR  limits,  the  n-spectrum  variance  does  indeed  allow  such  a 
decoupling. 

Proposition  4.4.9.  (Low-SNR  Decoupling  Approximation):  In  the  low-SNR  limit,  the  logarithm 
of  the  total  variance  of  any  cycle  is  given  (up  to  a  global  additive  constant)  by  the  sum  of  the  costs 
assigned  to  each  baseline  in  the  cycle,  where  the  cost  of  a  baseline  k  with  visibility  'Yk  is  given  by: 

CJc  =  log|  -21og7fc. 

Proof:  This  can  be  seen  by  re-writing  the  low-SNR  approximation  in  Equation  (4.26)  as: 

log  ~  E  log  I  -  2  log  7;  +  C  (4.65) 

where  C  =  logN^  is  a  constant  with  respect  to  both  visibility  and  flux  and  therefore 
irrelevant  for  the  purposes  of  closure  selection.  □ 

Proposition  4.4.10.  (High-SNR  Decoupling  Approximation):  In  the  high-SNR  limit,  the  total 
variance  of  any  cycle  is  given  (up  to  a  global  scaling  factor)  by  the  sum  of  the  costs  assigned  to  each 
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baseline  in  the  cycle,  where  the  cost  of  a  baseline  i  with  visibility  is  given  by:  Ci  = 

Proof:  This  follows  directly  from  the  high-SNR  approximation  in  Equation  (4.25).  □ 

Based  on  the  Propositions  above,  we  now  have  a  way  to  construct  a  set  of  linearly- 
independent  generalized  closures  of  minimum  variance  at  the  low-  and  high-  SNR  extremes. 
As  demonstrated  in  the  next  section,  even  at  moderate  SNRs,  this  closure  selection  method 
can  yield  significant  improvements  over  traditional  closure  imaging  restricted  to  baseline 
triangles.  To  estimate  the  scaling  of  the  computational  cost  of  the  method,  recall  that  the 
greedy  algorithm  is  run  on  the  set  of  Horton  cycles.  Since  there  are  Nap  shortest-path  trees, 
each  rooted  at  a  distinct  aperture,  and  m  —  (Nap  —  1)  cycles  in  each  of  these  trees,  the  total 
number  of  Horton  cycles  is  0{mNap)-  Since  m  =  0(Nl^),  we  see  that  the  algorithm  is 
0{N^p),  which  is  the  same  complexity  as  if  the  greedy  algorithm  were  run  on  an  exhaustive 
listing  of  all  three-baseline  closures. 


4.5  A  Practical  Algorithm  for  RSC  Closure  Imaging 

In  this  section  we  synthesize  the  techniques  developed  in  previous  sections  into  an  algorithm 
for  RSC  imaging  using  generalized  closures.  Recall  that  closure  selection  relies  on  estimates 
of  the  baseline  visibilities  {7;}.  These  estimates  can  be  derived  for  the  pairwise  case  by 
solving  Equation  (4.12)  for  7  to  yield: 


V  (zz*)  —  2h 

7/  =  — z - 

n 

Similarly,  for  the  Eizeau  case,  we  can  solve  Equation  (4.40)  to  obtain: 


(4.66) 


7/  = 


y (zz*)  -  Napn 


(4.67) 


Having  estimated  both  Eourier  phases  and  visibilities,  we  can  form  estimates  for  the 
complex  visibilities  y  of  the  object.  To  estimate  the  image  coefficients  x  from  these  complex 
visibilities,  we  must  apply  a  deconvolution  algorithm.  Based  on  the  demonstrated  effective¬ 
ness  of  recent  compressed  sensing  techniques,  we  choose  the  following  regularization  to 
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perform  this  deconvolution: 

We  use  the  standard  CS  reconstruction  technique  called  Total  Variation  Minimization  to 
solve  (B)  Becker  et  al.  (2011): 


X  =  argmin^  ||a||jy  subject  to:  ||Fft;  —  y||2  <  e  (4.68) 

where  F  is  a  partial  Fourier  matrix  whose  rows  are  vectorized  representations  of  the  2D 
sinusoids  associated  with  the  array's  measured  spatial  frequencies. 

Algorithm  3  provides  the  complete  set  of  steps  we  have  discussed. 


Algorithm  3  RSC  Image  Reconstruction  Algorithm 

1.  Estimate  Visibilities  7;  (c.f.  Equations  (4.66)-(4.67)) 

2  Select  N-Spectra  Relations  via  Minimum  Cycle  Basis 

2.1  choose  decoupling  approximation  according  to  SNR  regime  (c.f.  Proposition  4.4.9  or 
4.4.10) 

2.2  assign  baseline  edge  weights  accordingly 

2.3  enumerate  the  Horton  cycles  of  the  interferometric  graph 

2.4  execute  greedy  algorithm  on  set  of  Horton  cycles  to  find  minimum-variance  set  of 
wz  —  {Nap  —  1)  n-spectra 

3.  Average  the  Selected  N-Spectra 

4.  Unwrap  the  generalized  closure  phases  corresponding  to  these  N-Spectra  (c.f.  Sec¬ 
tion  4.4.1) 

5.  Solve  generalized  least-squares  problem  with  unwrapped  generalized  closures 

6.  De-convolve  the  estimated  complex  visibilities  to  produce  image  reconstruction 
using  the  regularization  in  Equation  (4.68)  or  other  regularization  technique 


4.6  Algorithm  Performance 

In  this  Section,  we  present  the  results  of  application  of  Algorithm  3  to  a  simulated  scenario 
of  imaging  a  structured  object  space. 

4.6.1  Sensitivity  Limits 

A  useful  benchmark  for  our  results  is  provided  by  the  Cramer-Rao  Lower  Bound  (CRLB) 
for  interferometric  phase  estimation  in  the  absence  of  atmospheric  turbulence.  We  begin  by 


95 


defining  our  fringe  measurement  model  in  the  standard  way: 


p  =  A 


^cv 


Ycv 


(4.69) 


where  p  is  the  vector  of  pixel  counts,  Xcv,  and  jcv  are  respectively  the  (^'’)  real  and  imaginary 
components  of  the  complex-visibility  components,  respectively,  and  A  is  the  matrix  mapping 
the  real  and  imaginary  parts  of  the  complex  visibilities  to  pixels  on  the  focal  plane  (often 
called  the  visibility-to-pixel  matrix  V2PM  in  the  literature).  Recall  from  Section  1.4  that 
rows  of  A  are  the  sinusoidal  functions  associated  with  each  fringe  generated  by  the  beam 
combiner. 

Leveraging  the  analysis  in  Chapter  1  (see  Section  1.4)  we  can  compute  the  Fisher- 
Information  Matrix  (FIM)  for  interferometric  estimation  with  a  Fizeau  beam-combiner 
as: 


(a^A7;a)^'  (4.70) 

where  is  a  diagonal  matrix  whose  diagonal  entries  are  the  expected  values  of  the  photon 
counts  at  each  detector  in  the  array. 

The  CRLB  for  the  covariance  of  these  vector  parameters  is  then  given  by: 

(4.71) 

where  the  notation  >  in  this  context  means  that  the  matrix  difference  on  the  left-hand  side 
is  positive-semidefinite. 

The  above  bound  applies  to  single-frame  fringe  phasor  estimation  for  a  non-redundant 
array.  Since  we  are  instead  evaluating  phase  estimation  schemes  for  a  redundant  array 
given  multiple  frames  of  data,  we  must  incorporate  these  transformations  into  the  bound 
(Kay,  1993).  Namely,  for  a  redundant  array,  the  parameter  vector  must  be  shortened  to  the 
d  distinct  complex- visibility  phasors  x^^^^  -|-  iyf)Ycv,d-  define  the  vector  as  the 

^The  interested  reader  is  directed  to  Zmuidzinas  (2003)  for  the  original  derivation 
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concatenation  of  and  ycv,d-  The  fringe  model  is  then  given  by:  p  =  where 

R  is  a  2 (^'’)  X  2d  matrix  mapping  the  real  and  imaginary  parts  of  the  distinct  complex 
visibilities  to  those  of  the  generated  fringes.  The  FIM  then  becomes: 

Iz.M  =  (R^A^A7;Ar)‘'  (4.72) 

Defining  the  function  Vg  =  arctan  the  CRLB  for  the  phase  covariance  matrix  Eg  can 

^  ^  ^CV,d  ^ 

be  expressed  as: 


_ 1  {^cv,d)  ^  ^  p  (4  73) 

^frames  ^^cv,d  ^^cv,d 

where  is  the  Jacobian  matrix  of  the  distinct  Fourier  phases  with  respect  to  the 

distinct  complex-visibility  phasors. 

Since  the  left-hand-side  of  Inequality  (4.73)  is  positive  semi-definite  (PSD),  and  the 
diagonal  entries  of  a  PSD  matrix  must  be  non-negative,  we  arrive  at  the  following  useful 
bound  for  the  variance  of  the  estimated  phases: 

var{ei)  >  — 

frames 

4.6.2  Simulation  Results  for  Algorithm  3 

In  this  section,  we  provide  the  results  of  a  simulation  in  which  Algorithm  3  was  applied 
to  both  interferometric  architectures.  For  these  simulations  we  used  an  RSC  pattern  of  the 
Y-pattern  type  as  shown  in  Figure  4.4.  The  corresponding  UV-sampling  for  this  pattern 
is  displayed  in  Figure  4.5.  To  prevent  aliasing  on  the  simulated  focal  plane,  this  aperture 
pattern  was  mapped  onto  the  Golay  non-redundant  pattern  (Golay,  1970)  shown  in  Figure 
4.6  for  fringe  generation;  the  model  emulates  the  redundant-to-non-redundant  pupil  re¬ 
mapping  technique  developed  by  Perrin  et  al.  (2006)  and  described  in  the  Introduction. 
Our  simulation  assumed  Poisson-distributed  shot-noise  and  idealized  detectors  with  zero 
read  noise,  and  so  is  representative  of  a  shot-noise-dominated  scenario.  For  bispectrum 


cv,d} 


j  — 1  ^S0\^cv,d) 

^^cv,d 


(4.74) 
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observables,  we  used  the  unbiased  estimator  reported  by  Gordon,  J.  A.  and  Buscher,  D.  F. 

(2012): 


gub  =  ZaZbZc  -  \Za\^  “  |Zfo|^  -  IZcf’  +  lNapfl  (4.75) 

where  Za,  zj,,  and  Zc  are  the  fringe  phasors  associated  with  the  three  sides  of  a  bispectrum 
triangle,  and  n  is  an  estimate  of  the  total  number  of  photons  per  aperture  incident  upon  the 
focal  plane.  ^ 

The  target  selected  was  NASA's  Cloud- Aerosol  Lidar  and  Infrared  Pathfinder  Satellite 
Observations  (CALIPSO)  satellite  (Hill,  2008)  for  which  the  truth  image  is  shown  in  Figure 
4.7.  Figure  4.8  shows  the  image  at  the  resolution  attainable  by  the  pattern.  Two  flux 
levels  (2000  photoelectrons /aperture /frame,  and  500  photoelectrons /aperture /frame)  were 
considered  for  these  simulations.  Comparative  error  analysis  using  Equation  (4.60)  showed 
a  lower  predicted  RMS  phase  error  for  the  decoupling  approximation  in  Proposition  4.4.10 
than  that  in  Proposition  4.4.9,  and  hence  the  former  was  chosen  in  Algorithm  3.  Bispectra 
and  n-spectra  were  integrated  for  5e4  frames.  An  implementation  of  the  Bellman-Ford- 
Moore  shortest-path  algorithm  (O'Cormor,  2012)  was  used  to  generate  a  minimum  cycle 
basis  as  per  Algorithm  3. 

To  show  the  potential  impact  of  generalizing  the  closure  phase  notion.  Algorithm  3  was 
compared  with  an  analogous  algorithm  which  instead  used  the  minimum-variance  set  of 
(N„p  independent  traditional  (i.e.  three-baseline)  closure  phases.  This  minimum  set  was 
generated  via  application  of  the  greedy  algorithm  described  above  to  the  set  of  all  (^“p~^). 

Numerical  results  for  with  the  pairwise  and  Fizeau  architectures  in  the  higher  flux 
scenario  are  shown  in  Figures  4.9  and  4.10,  respectively.  As  expected,  the  Fizeau  architecture 
proves  to  be  more  sensitive  than  the  Pairwise  architecture,  which  corroborates  analysis  by 
Zmuidzinas  (2003).  In  these  plots,  the  RMS  phase  error  for  each  distinct  spatial  frequency  is 
plotted  against  the  visibility  of  the  target  at  that  spatial  frequency.  Figures  4.11  and  4.12 

^We  did  not  derive  the  more  complicated  bias  corrections  for  higher-order  nspectra  as  empirical  results 
indicated  that  at  the  flux  levels  considered,  the  importance  of  these  bias  corrections  declined  with  the  order  of 
the  nspectra. 
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Figure  4.4:  RSC  aperture  pattern  used  in  simulation 


(/> 

.4— • 

’c 

D 

T3 

<U 

N 

TO 

£ 


■10 


■15 


o 

o 

o 

o 

o 

o 

o 

°c 

5 

o' 

o 

o 

o  o  o  o 

o 

o 

o 

o 

o 

o 

c 

o 

o 

0 

o 

o 

-10  -5  0  5  10 

X  (normalized  units) 


15 


Figure  4.5:  UV-sampling  for  RSC  pattern 
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Figure  4.6:  Golay  non-redundant  beam-combiner  pattern 
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Figure  4.7:  Truth  image  for  simulation:  the  CALIPSO  satellite 
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Figure  4.8:  Truth  image  at  the  resolution  of  the  interferometric  pattern 


respectively  show  Fizeau  performance  for  the  higher  flux  scenario  at  a  shorter  integration 
time  (le3  frames),  and  for  the  lower  flux  scenario  at  the  long  integration  time  (5e4  frames). 
Note  that  in  the  use  of  independent  generalized  closure  phases  according  to  Algorithm 
3  outperforms  the  analogous  scheme  in  which  only  traditional  closures  are  used;  the 
root-mean-squared  (RMS)  errors  are  noticeably  lower  for  the  former. 

It  is  interesting  to  examine  performance  relative  to  the  CRLB  described  in  Section 
1.4.  Recall  that  our  CRLB  is  an  atmosphere-oracle  bound  and  hence  it  should  bound  phase 
estimation  for  the  case  of  a  stationary  fringe  averaged  over  many  frames.  The  variance  of 
such  an  estimator  can  be  easily  calculated  by  leveraging  the  analysis  in  Section  4.3.  Namely 
the  pseudovariance  of  the  phasor  is  given  by  applying  Equation  (4.12)  after  averaging,  so 
that  SNR  of  the  averaged  fringe  phasor  z  is  given  by: 


2  _  ^ 

~  SNRy 


oi 


K 


ap 


(4.76) 


||E[z]||^  frames^T^ 

We  compute  the  predicted  phase  RMS  by  taking  the  square-root  of  this  expression,  and 
add  it  to  the  plots.  From  the  plots,  we  see  that  CRLB  is  virtually  identical  to  this  expression. 
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Figure  4.9:  Pairwise  Phase  Recovery  Results  for  Flux  n  =  2e3  pelapiframe,  BeA  frames 


except  for  isolated  dips  in  the  CRLB.  These  dips  correspond  to  the  redundant  baselines, 
which  as  expected,  permit  increased  phase  sensitivity  due  to  their  multiplicity. 

In  the  moderate-flux  scenario  (n  =  2e3),  our  algorithm's  performance  is  within  a  factor 
of  2  in  phase  error  from  the  CRLB.  As  the  flux  drops  to  n  =  5e2,  the  performance  begins 
to  diverge  from  the  CRLB  as  the  closure  phase  variance  leaves  the  regime  described  by 
Equation  (4.25)  and  enters  the  regime  described  by  Equation  (4.24).  That  is,  the  variance 
is  no  longer  accurately  modeled  simply  by  a  linear  combination  of  the  individual  phase 
variances  of  the  form  in  Equation  (4.76);  rather  it  becomes  inversely-proportional  to  the 
product  of  the  squared-visibilities.  The  rapid  decrease  in  fidelity  of  the  closure  phase  at 
low-SNR  is  well-known  in  interferometry  (Kulkarni  et  ah,  1991). 

Sample  image  reconstructions  are  shown  in  Eigure  4.13.  The  reconstructions  derived 
from  generalized  closures  show  greater  fidelity  to  the  true  image  than  those  derived  from 
traditional  closures,  and  thereby  corroborate  the  numerical  analysis  presented  in  the  plots. 

It  is  noteworthy  that  while  our  algorithm  attempts  to  find  a  minimum-variance  set  of 
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Figure  4.10:  Fizeau  Phase  Recovery  Results  for  Flux  n  =  2e3  pefapiframe,  5eA  frames 
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Figure  4.11:  Fizeau  Phase  Recovery  Results  for  Flux  n  =  2e3  pejaplframe,  le3  frames 
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Figure  4.12:  Fizeau  Phase  Recovery  Results  for  Flux  n  =  5e2  pe/ap/frame,  BeA  frames 


n-spectra,  this  set  is  not  necessarily  the  optimum  set  for  phase  estimation;  our  algorithm 
does  not  account  for  the  conditioning  of  the  matrix  Coc  resulting  from  a  particular  choice  of 
nspectra.  In  fact  there  may  be  some  cases  in  which  the  use  of  traditional  closures  results 
in  lower  error  in  the  phase  error  due  to  better  conditioning  in  the  corresponding  Coc-  In 
practice  the  decision  to  use  traditional  or  generalized  closures  can  be  resolved  by  examining 
the  phase  errors  predicted  by  Equation  (4.60),  for  basic  solutions,  or  Equation  (4.64)  for 
pseudo-inverse  solutions. 

4.6.3  Generalized  Closures  in  Non-Linear  Least  Squares  Approaches 

This  paper  has  proposed  both  novel  reconstruction  methodology  as  well  as  new  interfero¬ 
metric  observables.  A  natural  question  arises  as  to  whether  the  advantages  of  the  latter  are 
retained  with  other,  existing  methodology.  In  particular,  we  consider  the  class  of  techniques 
which  solve  the  non-linear  least  squares  inference  problem  involving  bispectrum  phasors 
(Gorham  et  al,  1989),  (Negrete-Regagnon,  1996)  (see  Section  4.2). 
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Figure  4.13:  Fizeau  Image  Reconstruction  Results.  The  reconstructions  in  the  left  column  used  traditional, 
three-baseline  observables,  whereas  those  in  the  right  column  used  generalized  closures  selected  according 
to  Algorithm  3.  (top  row)  n  =  2e3  photoelectrons/aperture/frame  (pe/ap/frame),  Sei  frames,  (middle  row) 
n  =  2e3  pe/ap/frame,  le3  frames,  (bottom  row)  n  =  5e2  pe/ap/frame,  Sei  frames. 
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While  these  algorithms  bring  the  complexities  involved  in  non-linear  optimization  (e.g. 
possible  stagnation  in  local  minima  and/ or  slow  convergence),  they  also  feature  the  ability 
to  utilize  any  number  of  closure  relations.  Algorithm  3  restricts  attention  to  closure  relations 
which  form  cycle  bases  in  order  to  guarantee  well-posed  Fourier  phase  recovery  while 
keeping  the  size  of  the  associated  CVP-unwrapping  problem  tractable.  However,  as  noted 
by  Kulkarni  et  al.  (1991),  fitting  to  all  (^'’)  closures  will  in  principle  improve  estimation 
performance  when  the  per-frame  flux  is  low,  due  to  the  fact  that  closure  phases  de-correlate 
as  the  flux  decreases  Specifically,  as  computed  by  Kulkarni  et  al.  (1991)  for  traditional 
closures  in  the  pairwise  case,  the  correlation  coefficient  }i  of  two  closures  sharing  a  common 
baseline  approaches  ^  in  the  high-SNR  regime  (i.e.  for  n7^  3>  1)  as  one  would  expect  its 
behavior  in  the  low-SNR  regime  can  be  described  as  ^  ~ 

To  assess  the  performance  of  generalized  closures  relative  to  that  using  the  complete  set 
of  closures,  we  consider  the  algorithm  developed  first  by  Gorham  et  al.  (1989)  which  works 
the  normalized  bispectrum  instead  of  the  closure  phase.  We  used  the  MATLAB®  non-linear 
least-squares  solver  known  as  Isqnonlin  (MATLAB,  2012)  to  minimize  the  objective  Y2  in 
Equation  (4.3).  This  solver  obtains  quadratic  approximations  of  the  objective  at  each  step  and 
moves  towards  the  minimum  of  this  approximation,  at  which  point  another  approximation 
is  computed  and  the  process  repeats.  The  weights  in  Y2  were  set  as  in  Gorham  et  al.  (1989): 


Wi  = 


Ik-.ll 


(4.77) 


where  ||gi||  is  the  magnitude  of  the  averaged  unbiased  nspectrum,  and  is  the  empirical 
standard  deviation  of  its  quadrature  components. 

Results  of  applying  this  solver  to  the  lower-SNR  scenario  of  the  previous  section  are 
given  in  Figure  4.14.  The  top  row  shows  reconstructions  and  associated  convergence 
times  for  the  traditional  and  generalized  cycle  bases,  respectively,  given  the  same  random 
initialization  point  in  each  case.  The  bottom  row  gives  analogous  results  for  the  case  in 


®The  interested  reader  is  directed  to  Buscher  (2015)  for  a  dedicated  discussion  of  this  issue 
®Here  Kulkarni  et  al.  (1991)  assumes  a  common  visibility  7  among  all  baselines  in  the  closure 
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Figure  4.14:  (top  row)  Reconstructed  images  and  convergence  times  for  traditional-closure  (left)  and 
generalized-closure  (right)  cycle  bases  (bottom  row)  Reconstructed  images  and  elapsed  running  times  with  all 
(  “f")  traditional  closures  at  iteration  20  (left)  and  iteration  40  (right) . 


Standard  Cycle  Basis 
Convergence  Time  -^9  sec 


Generalized  Minimum  Cycle  Basis 
Convergence  time  ‘“Z  sec 


All  Closures  All  Closures 

20  iterations,  ^71  sec  40  iterations,  *-139  sec 


which  all  traditional  closures  are  used.  Two  snapshots  at  iteration  counts  20  and  40  are 
shown.  Iterations  after  40  resulted  in  negligible  change  to  the  image  quality.  These  results 
suggest  that  generalized  closures  can  provide  at  least  the  sensitivity  of  traditional  closures. 
At  the  same  time,  they  benefit  from  the  increased  convergence  speed  afforded  by  a  problem 
size  proportional  to  the  size  of  cycle  basis  (i.e.  oc  as  opposed  to  the  size  of  the  set  of  all 
closures  (i.e.  oc  N^p).  These  results  hence  show  that  generalized  closures  can  serve  as  more 
efficient  sources  of  phase  information  than  traditional  closures. 

It  is  noteworthy  that  the  selection  of  a  minimum-variance  set  of  generalized  closures 
involves  computational  overhead  in  the  execution  of  the  greedy  algorithm.  However,  it  is 
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Figure  4.15:  Reconstructed  images  and  convergence  time  for  generalized-closure  cycle  basis  derived  from 
Minimum-Spanning-Tree 


Fundamental  Cycle  Basis  (Minimum  Spanning  Tree) 
Convergence  Time  *-3  sec 


anticipated  that  in  many  cases  we  may  be  able  to  obtain  a  reliable  surrogate  for  this  set  by 
instead  by  forming  a  fundamental  cycle  basis  associated  with  minimum- variance  sparming 
tree  of  the  interferometric  graph.  Constructing  this  alternate  basis,  which  was  introduced  in 
Section  3.4.3,  obviates  the  need  for  execution  of  the  greedy  algorithm;  it  merely  requires 
the  execution  of  a  Minimum-weight  Sparming-tree  algorithm  (e.g.  Prim-Jarnik  Algorithm 
(Prim,  1957))  with  each  edge  weight  set  to  reciprocal  of  the  squared  visibility  associated 
with  baseline  Indeed,  many  of  the  elements  of  the  minimum  cycle  basis  for  our  pattern 
and  scenario  shows  that  they  are  also  members  of  the  fundamental  basis  associated  with 
the  minimum-weight  spanning  tree  of  the  interferometric  graph  shown  in  Figure  4.16.  Not 
surprisingly,  the  reconstruction  with  this  basis  shown  in  Figure  4.15  approaches  of  the 
quality  that  for  the  minimum-variance  set  (compare  with  image  in  the  upper-right  panel  of 
Figure  4.14). 

While  the  analysis  of  Kulkarni  et  al.  (1991)  revealed  the  advantage  of  supplementing  a 
cycle  basis  with  additional  traditional  closures  in  the  pairwise  case,  this  advantage  clearly 
applies  to  generalized  closure  phases  in  the  Fizeau  case.  Our  CRLB  comparison  above  (i.e. 
Figures  4.10  and  4.11)  confirms  that  there  is  limited  scope  for  improvement  in  the  regime 

^For  our  simulation,  we  used  a  MATLAB®  implementation  of  Prim's  algorithm  which  is  available  online 
(Greenbaum,  2007) 
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Figure  4.16:  The  minimum-variance  spanning  tree 


Minimum  Spanning  Tree 


«  60.  In  the  regime  explored  in  the  lower-SNR  scenario  (see  Figure  4.12),  however,  there 
is  certainly  a  CRLB-gap  which  may  be  closed  with  appropriate  selection  of  supplementary 
generalized  closures.  We  leave  the  development  of  algorithms  to  perform  such  a  selection 
efficiently  for  future  work. 

4.7  Conclusion 

In  this  chapter,  we  have  developed  a  novel  method  for  interferometric  imaging  which 
employs  RSC  techniques  and  a  generalized  notion  of  bispectrum  observable  (the  n-spectrum). 
We  have  established  that  phase  estimation  from  these  observables  is  well-posed  for  valid 
RSC  arrays.  We  have  also  provided  a  fast  algorithm  for  selection  of  a  minimum-variance  set 
of  these  observables,  which  is  based  on  the  concept  of  minimum  cycle  basis  in  graph  theory. 
Then,  leveraging  lattice-theory  techniques  first  proposed  by  Larmes  and  Anterrieu  (1999)  for 
unwrapping  of  the  closure  phases,  as  well  as  novel  techniques  from  compressed  sensing 
for  image  reconstruction  in  the  presence  of  Fourier  undersampling,  we  have  proposed  a 
new  algorithm  for  image  reconstruction  in  optical  interferometry.  We  have  shown  that 
the  performance  of  our  Fourier  phase  estimation  part  of  our  algorithm  can  be  quantified 
from  first  principles  using  standard  linear  estimation  theory.  We  have  used  simulation  at 
different  photon-flux  levels  to  corroborate  this  analysis,  and  to  show  the  potential  advantage 
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of  performing  inference  on  the  n-spectrum  observable  relative  to  its  classical  counterpart  - 
the  bi-spectrum.  It  is  our  hope  that  both  the  theoretical  framework  employed  in  this  chapter, 
as  well  as  the  practical  algorithm  itself,  will  prove  useful  to  future  users  in  designing 
RSC-based  interferometric  systems  and  processing  their  data. 
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Chapter  5 


Conclusions 


This  thesis  has  presented  both  theoretical  and  practical  contributions  to  the  study  of  robust 
imaging  in  optical  interferometry.  Our  principal  theoretical  contribution  is  that  we  have 
developed  the  notion  of  the  wrap-invariant  RSC  pattern.  We  have  shown  that  wrap-invariant 
aperture  patterns  allow  for  unique  recovery  of  the  Fourier  phase  of  the  object  under 
observation.  These  results  are  fundamental  to  the  RSC  method  in  the  sense  that  they 
apply  to  the  entire  family  of  techniques  which  fit  the  Fourier  phase  either  directly  to  the 
complex  visibilities  measured  by  the  interferometer,  or  atmosphere-invariant  combinations 
of  these  such  as  the  bispectrum.  We  summarize  our  uniqueness  results  in  Table  5.1, 
classifying  them  according  to  the  Approach  type  (i.e.  Phase  vs.  Phasor),  and  the  observable 
used  in  the  phase  recovery  (i.e.  direct  measurements  of  the  fringe  phase  or  phasor  vs. 
atmosphere-invariant  combination).  References  to  other  work  in  the  literature  employing 
each  observable-approach  pair  are  also  provided  for  context. 

We  have  leveraged  these  theoretical  results  to  construct  a  novel  algorithmic  framework  for 
image  reconstruction.  In  our  framework,  we  have  generalized  classical  atmosphere-invariant 
observables  to  improve  estimation  of  the  Fourier  phase  of  complex  objects.  The  ultimate 
goal  of  this  effort  was  to  achieve  near-immunity  to  the  effect  of  atmospheric  turbulence 
in  terms  of  Fourier-phase  estimation  accuracy,  or  more  concretely,  to  show  performance 
approaching  the  atmosphere-oracle  Cramer  Rao  Lower  Bound.  Using  our  algorithm,  we 


111 


provide  strong  theoretical  and  empirical  evidence  that  we  can  achieve  this  goal  in  the 
regime  in  which  the  per-frame  photon  flux  is  on  the  order  of  10  photons  per  interferometric 
fringe.  We  have  seen  that  the  increased  sensitivity  of  our  generalized  observables  extends 
to  other  existing  phase-recovery  algorithms  used  in  the  field.  Lastly,  we  demonstrate  that 
the  accurate  phase  estimation  provided  by  our  framework  allows  powerful  techniques  from 
Compressed  Sensing  to  generate  high-quality  interferometric  imagery. 
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Appendix  A 


Appendix  to  Chapter  1 

A.l  Sinusoidal  Dependence  of  Field  on  Interferometer  Focal  Plane 

To  aid  in  the  derivation.  Figure  A.l  shows  the  geometry  involved  in  beam  combination  at 
the  focal  plane. 

Note  that  we  can  express  the  vector  position  of  pixel  at  coordinate  p  relative  to  the 
position  of  aperture  j  as: 


Xj  =  d  +  p-Tj  (A.l) 

Moreover  the  magnitude  of  the  wavevector  ky  is  while  the  unit  vector  in  the  direction 
of  ky  is  given  by: 

h-  +  “I 

Villi'  +ld/ll 

Hence  the  phase  added  to  the  field  due  to  the  path  between  the  beam  combiner  and 
focal  plane  (at  pixel  p)  is  given  by: 

9  JT 

^  ^  +  d]  •  [d  +  p  -  ry]  (A.3) 

Expanding  the  dot-product  and  noting  that  d  •  p  =  0,  d  •  ry  =  0,  and  d  •  d  is  constant 
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3 


Figure  A.l:  Beam-Combination  Geometry 


independent  of  apertures,  we  obtain  (up  to  a  constant  offset  independent  of  aperture): 


ky  •  Xj  = 


In 
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IJI|2  I  II  ||2 

d  +  r; 


p)  +  <pj 


(A.4) 


where  (pj  := 


27r  r 
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120 


Appendix  B 


Appendix  to  Chapter  3 

B.l  Proof  of  Lemma  3.6.2 

In  this  section,  we  prove  Lemma  3.6.2,  which  states  the  following: 

Given  wrap-invariance,  the  (column)  vector  UxM^e^  has  integer  entries  below  row  2. 

Given  that  we  have  a  wrap-invariant  pattern,  we  know  that  the  elementary  divisors  of 
M  are  all  1.  Hence  there  exists  an  integer  vector  ko  such  that: 

el  =  Mko  (B.l) 

Substituting  Equations  (C.47)  and  (3.26)  into  Equation  (C.40),  we  obtain: 

e^  =  V^E+UjU^L^Vjko  (B.2) 

Noting  that  Ucr  is  orthogonal,  this  equation  can  be  simplified  to 

e^  =  (V^  -  N)V^ko  (B.3) 

where  N  is  a  matrix  of  the  same  size  as  Ya,  which  is  zero  except  for  the  last  three 
columns.  These  last  three  columns  are  identical  to  those  of  and  hence  by  Lemma  4.1 
comprise  an  orthogonal  basis  for  the  nullspace  of  M.  Noting  the  orthogonality  of  this 
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can  be  further  simplified  to: 


=  ko  -  NVjko  (B.4) 

To  proceed,  the  following  Definition  will  be  useful: 

Definition  A.l  (The  canonical  basis  for  the  nullspace  of  M):  The  canonical  basis 
{w,},  i  G  1,2,3  for  the  three-dimensional  nullspace  of  M  can  be  derived  trivially  from  the 
well-known  tilt-position  degeneracy  in  interferometry  described  in  Section  3.4.1  (Wieringa, 
1992).  Namely,  we  can  define  the  basis  as  the  columns  of  a  {N  +  d)  x3  matrix  'WkeriM) 
follows: 


Wfer(M)  =  {W1|W2|W3}  = 


Wxl 


0 


Arx  Ary 


(B.5) 


where  Tx  and  ry  are  the  x—  and  y— positional  coordinates  of  the  apertures  associated 
with  each  row,  respectively,  and  Ar^  and  Ary  their  respective  pairwise  differences.  □ 

Note  that  each  of  the  three  non-zero  column  vectors  {vjt},  k  G  1, 2, 3  in  N  can  be  expressed 
as  a  linear  combinations  of  the  elements  of  the  canonical  basis  { w,},  i  G  1, 2, 3  defined  above, 

i.e. 


V/t  =  fliWi  -|-  fl2W2  -|-  fl3W3  (B.6) 

As  in  Section  3.4.1,  let  us  again  use  K  to  denote  the  set  of  indices  in  eg-  associated  with 
the  Fourier  phases  (as  opposed  to  the  piston  phases),  and  their  corresponding  rows  in  N. 
Hence  we  have: 


Uxe^,x  =  Uxko,K  -  UxNxVjko  (B.7) 

Since  Ux  is  an  integer  matrix,  the  first  term  is  clearly  integral.  Let  us  then  examine  the 
second  term,  and  in  particular,  the  product  UxN^.  By  substitution  from  Equation  (C.51),  we 
see  that  or  any  of  the  three  non-zero  columns  ^  of  Nx,  we  have: 
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UxVk,^:  =  fllUxWi^X  +  fl2UxW2,x  +  ^sUxws^x  (B.8) 

where  {wy  x}  are  the  vectors  comprising  the  lower  partition  of  Equation  (B.5).  The  first 
term  in  Equation  (C.53)  is  trivially  0  since  that  wi j  is  the  zero-vector.  Now  recall  that  Ux  is 
a  matrix  which  annihilates  all  the  spatial  frequencies  in  the  matrix  X  below  row  2.  But  from 
Definition  A.l,  these  spatial  frequencies  are  identically  the  contents  of  the  two  columns  W2,x 
and  W3  X  (up  to  a  uniform  scaling  factor).  Therefore  the  column  vector  in  Equation  (C.53)  is 
zero  below  row  2.  This  means  in  turn  that  the  second  term  in  Equation  (C.52)  is  zero  below 
row  2,  and  hence  that  Uxe^T-j  is  integral  below  row  2  (since  the  first  term  is  integral).  □ 
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Appendix  C 


Appendix  to  Chapter  4 


C.l  Fizeau  Variance  Approximations 

C.l.l  Variance  Decomposition 

In  this  section  we  justify  our  approximation  for  the  variance  of  the  n-spectrum  for  the  Fizeau 
architecture.  Our  goal  here  is  not  to  compute  all  of  the  numerous  terms  in  the  variance,  but 
rather  to  present  the  mathematical  intuition  behind  the  approximation  we  have  used.  The 
general  formula  for  the  variance  of  an  n-spectra  G  is: 

For  simplicity,  we  analyze  the  standard  bispectrum  case  (o  =  3).  The  analysis  extends 
naturally  to  the  n-spectrum.  Note  that  by  extension  of  Equation  (4.8)  the  first  term  can  be 
written  as: 


(C.l) 


E  {^{pci)^{pb)q{pc)q{pd)q{pe)q{pf)) 

Pa,Ph,Pc,Pd,Pe,Pf 


X  £i‘^l{Pd-Pa)  ^icOliPe-Pb)  gi<^3{Pf-Pc)  (^2) 
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Similarly,  the  second  term  in  Equation  (C.l)  can  be  written  as: 


(fr')  (fr' >  ^ 

E  {‘^ipa)^ipb)q{pc)){q{pd)q{pe)q{pf)) 

Va,Ph,Pc,Pd,Pe,Pf 


X  gi‘^l{Pd-Pa)  gicOliPe-pb)  gicOsiPf-Pc)  ((^3) 


Proceeding  analogously  to  Equation  (4.10),  we  utilize  the  general  formula  for  the 
moments  of  a  Poisson  distribution  to  perform  the  following  decomposition  of  the  first 
factor  in  the  summand  in  Equation  (C.2)  (Kulkarni  et  ah,  1991): 
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{‘^ipa)q{pb)q{pc)q{pd)‘]{pe)q{pf))  =  (C.4) 

+{(i{pa)){q{pb)){q{pc)){qipd)){q{pe)){q{pf))  (c.5) 

+^p=p„=pAqip)}{q(pb)}{q{pc)}{q{pe)}{q{pf)}  (c.6) 

+Sp=p,=pAq{p)){qipa)){q{pc)){q{pd)){q{pf))  (c.7) 

+^p=p,=Pf{q{p)){q(pa)){q(Pb)){q{pd)){q(pe))  (c.8) 

+Sp^p,=pAqip))^P'=P,=Pe{q{p')){qipc)){q{pf))  (c.9) 

+^p=^Pa=pd{qip))^p'^Pc=pf{qip )) (qipb)) {qipe))  (c.io) 

~^^p=Pb=pe{qip))^p'=pc=pf{qip )) {qipa)) (qipc))  (c.ii) 

~^^p=Pa=Pd{qip))^p'^Pb=pAqip  ))^p"=pc=pf{qip  ))  (c.12) 

+Sp=Pa=pAqip)){qiPb)){q{pc)){q{Pd)){q{pf))  (c.is) 

+^p^Pa=Pf{qip)){q(pb)){q{pc)){q{Pd)){q(pe))  (C.i4) 

+^p^Pb=Pd{q(p)){q(pa)){q{pc)){q{pe)){q{pf))  (c.is) 

+^p=Pb=Pf{q{p)){q{pa)){q{pc)){q{pd)){q(pe))  (C.i6) 

+^P=Pc=Pd{q(p)){q(pA){q{pb)){q{pe)){q{pf))  (c.iz) 

+Sp=p,=pAqip)}{q(pa)}{qipb)}{q{pd)}{q{pf)}  (c.is) 

+^p=Pa=Pb{q{p)){q{pc)){q(pd)){q{pe)){q{pf))  (c.i9) 

+^p^p„=pAqip)){q{pb)){q{pd)){q{pe)){q{pf))  (c.20) 

+^p=Pb=pAqip)}{q{pa)){q{pd)){q{pe)){q{pf))  (c.21) 

+^P=Pd=pAqip)}{qipA}{qipb)}{q{pc)}{q{pf)}  (c.22) 

+Sp^p,=Pf{q{p)){q{pa)){q{pb)){q{pc)){q{pe))  (C.23) 

+^p=Pe=Pf{qip)}{q{pa}}{q{pb)}{qipc)}{qipd)}  (c.24) 

+Sp^Pa=Pb=pAq{p)){qipd)){q{pe)){q{pf))  (c.25) 

...  +  other  third-order  terms  (C.26) 

~^^p=pa=pb=pc=pd  {q{p)){q{pA){q{pf))  (c.27) 

...  fourth-order  terms  (C.28) 

...etc.  (C.29) 


The  total  number  of  partitions  (203)  in  the  summation  above  is  given  by  the  6th  Bell 
number.  Below  we  categorize  the  terms  by  their  order,  which  in  this  case  we  define  as  the 
largest  number  of  common  pixels  in  each  partition.  We  will  see  that  we  can  approximate 
the  pseudo-variance  to  reasonable  accuracy  by  including  a  particular  subset  of  the  Order-2 
terms.  Note  that  all  terms  in  the  analogous  decomposition  of  Equation  (C.3)  are  also  in 
Equation  (C.2)  and  hence  cancel  them  in  Equation  (C.l). 

C.1.2  Order-2  Terms 

These  terms  have  a  pair  of  common  pixels.  We  distinguish  between  the  following  two  cases: 

Case  1:  conjugate  pairs  These  terms  are  given  in  (C6)-(C12).  The  common  pixels  belong  to 
the  conjugate  pairs,  i.e.  p  '■=  p a  =  f  a,  f  '■=  fh  =  f z,  and  p  '■=  Pc  =  Pf,  respectively,  which 
also  appear  in  the  pairwise  case.  Let  us  assume  that  the  visibilities  associated  with  cui,  CO2, 
and  CO3  are  all  roughly  equal,  i.e.  7  =  71  =  72  =  73-  Consider  the  sum  of  the  first  of  these 
terms  (C6)  over  the  focal  plane,  i.e.: 


P=Pa=Pd,Pb,Pc,Pe,Pf 

=  E  ApEAp.e'“”’‘EAp,e''"‘+“’’'’' 

p-.=Pa=Pd  Pb  Pc 

Pe  Pf 

=  NapU^jhl  (C.30) 

Under  the  assumption  of  equal  visibilities,  this  expression  becomes  NapU^'y^-  Terms 
(C7)-(C8)  follow  this  same  form. 

Now  consider  term  (C9): 
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L 

P=Pa=Pd,P'  =  Pb  =  Pe,P^,Pf 

=  L  '^p  L  ApLAp,<-''"‘+“’'’' 

P--Pa=pd  p’--p^  =  p.  Pc 

X  ^Ap^e-'("i+^2)P/ 

Pf 

=  N%n‘^^l  (C.31) 

Under  the  assumption  of  equal  visibilities,  this  expression  becomes  N^^n^'y^.  Terms 
(ClO)-(Cll)  follow  this  same  form. 

Case  2:  mixed  non-conjugate  pairs 

We  now  consider  the  terms  in  which  a  pair  includes  two  distinct  baselines  (e.g.  (C13)- 
(C18)).  The  common  pixels  in  this  case  are  such  that  one  of  the  corresponding  baselines  is 
conjugated  and  the  other  is  not.  Taking,  for  example,  the  case  (C13)  in  which  p  :=  Pa  =  Pe, 
we  have: 


P=Pa=Pe,Pb,Pc,Pd,Pf 

X  {q{pci))e~™{q{pf))e~‘^-^^+^^^Pf 

p:=pa=pe  Pb  Pc 

E  Ap^e-'(^i+'^2)p/ 

Pd  Pf 

=  (C.32) 

where  ^wi-w2  is  the  visibility  of  the  fringe  of  spatial  frequency  coi  —  C02-  Note  that  of 
course  =  0  if  such  a  fringe  is  not  created  by  the  beam  combiner.  Otherwise,  assuming 

equal  strength  of  the  visibilities,  this  expression  reduces  to  Comparing  with  any  of 
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the  Case  1  terms  and  recalling  that  7  <  1,  we  see  that  the  latter  will  dominate  the  former  in 
the  regime  considered  in  this  paper  (n  «  le3,  7  ~  0-1/  ^ap  =  31). 

Case  3:  terms  in  decomposition  Equation  (C.3) 

This  Case  includes  terms  (C5),  as  well  as  (C19)-(C24).  The  common  pixels  are  such 
that  both  or  neither  of  the  corresponding  baselines  are  conjugated,  i.e.:  the  cases  pa  =  Pb, 
Va  =  Pc/  Pfc  =  Pc/  Pd  =  Pe/  Vd  =  fj ,  fe  =  fj-  These  terms  are  canceled  by  their  counterparts 
in  Equation  (C.2).  □ 

C.1.3  Higher  Order  Terms 

An  example  of  a  third-order  term  would  be  the  case  of  p  :=  pa  =  Pb  =  Vdr  whose  sum  is 
given  by: 


p:=Pa=Pb=Pd,Pc,Pi,Pf 


(C.33) 


E 

(C.34) 

Pa  =  Pb=Pd 

Pe 

-iW2PeJ^/^^^^-i{cVi+CV2)Pf 

(C.35) 

Pe 

Pf 

4  2  2 

=  n  7273 

(C.36) 

For  equal  visibilities,  we  have  n^7^-  Clearly  this  term  will  be  dominated  by  the  Case  1 
terms.  Other  high-order  terms  exhibit  a  similar  sharp  attenuation  which  allows  them  to  be 
neglected  for  practical  purposes.  □ 

Given  the  relative  strengths  of  the  terms  shown  above,  we  will  retain  Case  1  terms 
(C6)-(C12),  which  after  summation  and  factoring,  yield  Equation  (4.41).  Note  that  this 
amounts  to  asserting  that  distinct  baselines  are  (approximately)  statistically-independent. 
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despite  the  fact  that  they  are  measured  on  a  common  focal  plane,  i.e.  that: 


^fizeau  (G) 


!  =  1 


(C.37) 


!  =  1 


Applying  analogous  analysis  to  the  Fizeau  covariance  yields  the  following  approxima¬ 
tion: 


CT{Gi,G*2)ft 


n  i^kz*k)-  n  (2fc)(4) 

keinj  keinj 


n  (2/)  n  (4)  (C.38) 

lei\]  mej\i 


Note  that  this  matches  the  corresponding  pairwise  expression  in  Equation  (4.28).  Substi¬ 


tution  then  yields  the  final  Fizeau  covariance  expressions  in  Equations  (4.43)-(4.44). 


C.2  Proofs  of  Proposition  3.2  and  Corollary  3.3 

C2.1  Proof  of  Proposition  3.2 

Proposition  3.1:  For  a  valid  RSC  array,  the  columns  Ar^  and  Ar^  form  a  basis  for  the 
two-dimensional  nullspace  of  Coc- 

Proof:  To  see  that  the  two  columns  mentioned  belong  to  the  nullspace  of  Coc,  note 
that  each  solution  set  to  the  noiseless  version  of  Equation  (4.49)  above  remains  valid  after 
replacing  each  with  —  z  •  (r,  —  r^). 

We  then  need  to  establish  that  these  two  vectors  span  the  entire  nullspace  of  Coc  by 
showing  that  this  nullspace  is  two-dimensional.  Suppose  we  have  a  vector  w  which  is  in 
the  nullspace  Coc-  This  is  equivalent  to  the  either  of  the  following  conditions:  Mgw  =  0, 
or  Mgw  G  ker{Cmc)-  The  former  condition  is  not  possible  since  the  columns  sparming  the 
subspace  K  are  linearly-independent.  The  latter  condition  is  equivalent  to  the  condition  that 
w  G  R  n  L.  It  is  well-known  fact  that  for  any  two  subspaces  K  and  L,  we  have: 

dim{K  n  L)  =  dim{K)  +  dim{L)  —  dim{K  +  L)  (C.39) 

We  established  in  Section  3.4.1  that  dim{K  +  L)  =  rank(M)  =  d  +  N  —  3  for  a  valid  RSC 
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system.  Also  dim{K)  =  d,  since  the  K  is  spanned  by  d  linearly-independent  columns.  Finally 
dim{L)  =  N  —  1.  Substituting  into  Equation  (C.39),  we  see  that  dim{K  n  L)  =  2.  □ 

C.2.2  Proof  of  Corollary  3.3 

Corollary  3.2:  For  a  valid  RSC  array,  the  mapping  Coc  is  injective  up  to  an  image  shift. 

Proof:  Proposition  3.1  showed  that  the  nullspace  is  comprised  of  linear  combinations 
of  vectors  Ar;^  and  Ary.  Note  that  Ar^  and  Ary  are  simply  scaled  versions  of  the  x-  and 
y-spatial  frequency  vectors  in  the  array,  respectively.  Hence  adding  linear  combinations 
of  these  vectors  to  a  particular  RSC  phase  solution  merely  produces  phase  ramps  in  the 
Fourier  domain,  which  are  equivalent  to  translations  (or  shifts)  in  the  image  domain.  In 
other  words,  the  mapping  Coc  is  invertible  up  to  an  unknown  image  shift.  □ 

C.3  Proof  of  Proposition  3.6 

We  begin  the  Proof  with  the  following  Femma: 

Lemma  C.l:  The  final  2  columns  of  Ya  form  a  basis  for  the  nullspace  of  Coc- 
Proof:  This  follows  from  the  fact  Coc  is  rank-deficient  by  2,  and  standard  properties  of  the 
right  singular  vectors  comprising  Yg-  in  the  SVD.  (Bretscher,  2001)  □ 

Note  that  the  error  resulting  from  application  of  the  pseudo-inverse  C+  to  the  imwrapped 
vector  of  closures  will  be  given  by: 


iTzeg  =  C+ilnel)  (C.40) 

Let  us  express  the  spatial  frequencies  measured  by  an  array  as  two-element  vectors  of 
the  form  {c0x,C0y).  Let  X  be  the  d  x2  matrix  containing  these  spatial  frequencies.  Note  then 
that  the  phase-wrap  error  will  manifest  itself  merely  as  an  image  shift  if  and  only  if  this 
error  is  a  (modulo-27r)  phase  ramp,  i.e.  there  exists  a  2-element  shift  vector  z  and  an  integer 
vector  k  which  together  satisfy: 
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'Ino.cr  —  2nXz  =  27rk 


(C.41) 


Substituting  from  Equation  (C.40)  we  obtain: 

C  +  (2  Tie )  -  2  ttXz  =  2  Tik  (C  .42) 

Dividing  through  by  2k  we  obtain  the  equation:  —  Xz  =  k.  Note  that  each  element 

of  C+  can  be  expressed  as  some  rational  number  Similarly  we  first  assume  X  contains 
rational  spatial  frequencies  with  greatest  common  denominator  qx-  Then  we  can  multiply 
through  by  the  least-common-multiple  (LCM)  of  the  {qi}  and  qx  to  obtain  a  system  of 
equations  whose  coefficients  are  guaranteed  to  be  integer  (i.e.,  we  have  a  linear  Diophantine 
system).  Let  this  LCM  be  denoted  as  /.  Then  we  have,  after  rearranging  terms, 

/Xz  =  /(C+e;;-k)  (C.43) 

We  now  wish  to  determine  conditions  under  which  there  exist  vectors  k  and  z  satisfying 
this  overdetermined  Diophantine  system.  Applying  the  Smith  Normal  Lorm  decomposition 
(c.f.  Theorem  2.2)  to  the  matrix  IX  this  time,  and  noting  that  rank(X)  =  2,  we  have: 


Dx  =  Ux(/X)Vx  (C.44) 

where  Ux  and  Vx  are  unimodular  matrices  of  size  m  x  m  and  2x2,  respectively,  and 
Dx  is  a  rectangular  diagonal  matrix  whose  entries  are  zero  below  row  2. 

If  we  left-multiply  Equation  (C.43)  by  Ux  on  both  sides,  we  obtain: 

/UxXz  =  /Ux(C+e);-k)  (C.45) 

Using  Equation  (C.44)  and  the  fact  that  Vx  is  a  unimodular  (and  hence  invertible)  matrix, 
we  can  then  write: 


DxVx^z  =  /(UxC+e^-Uxk) 


(C.46) 
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We  are  now  in  position  to  prove  the  main  result  of  this  section,  which  is  preceded  by  the 
following  Lemma: 

Lemma  C.2:  Given  wrap-invariance,  the  (column)  vector  UxC+ e  has  integer  entries  below 
row  2. 

Proof: 

Given  that  the  elementary  divisors  of  Coc  are  all  1,  we  know  there  exists  an  integer  vector 
ko  such  that: 


el  =  Cocko  (G.47) 

Substituting  Equations  (G.47)  and  the  pseudo-inverse  definition  in  (4.63)  into  Equation 
(G.40),  we  obtain: 


(G.48) 

Noting  that  Ucr  is  orthogonal,  this  equation  can  be  simplified  to 

=  (V^  -  N)Vjko  (G.49) 

where  N  is  a  matrix  of  the  same  size  as  Ya,  which  is  zero  except  for  the  last  two  columns. 
These  last  two  columns  are  identical  to  those  of  Yg-,  and  hence  by  Lemma  4.1  comprise  an 
orthogonal  basis  for  the  nullspace  of  M.  Noting  the  orthogonality  of  V^,,  this  can  be  further 
simplified  to: 


=  ko  -  NVjko  (G.50) 

Note  that  each  of  the  two  non-zero  column  vectors  {v)t},A:  G  1,2,3  in  N  can  be  expressed 
as  a  linear  combinations  of  the  elements  of  the  canonical  basis  for  the  nullspace  of  Coc,  be. 


Vjt  =  fliWi  -|-  fl2W2 


Hence  we  have: 


(G.51) 
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Uxe^  =  Uxko  -  UxNxVjko  (C.52) 

Since  Ux  is  an  integer  matrix,  the  first  term  is  clearly  integral.  Let  us  then  examine  the 
second  term,  and  in  particular,  the  product  UxN.  By  substitution  from  Equation  (C.51),  we 
see  that  or  any  of  the  two  non-zero  columns  of  N,  we  have: 

UxVk  =  fliUxWi  -h  fl2UxW2  (C.53) 

The  first  term  in  Equation  (C.53)  is  trivially  0  since  that  wi  j  is  the  zero-vector.  Now 
recall  that  Ux  is  a  matrix  which  annihilates  all  the  spatial  frequencies  in  the  matrix  X  below 
row  2.  But  these  spatial  frequencies  are  identically  the  contents  of  the  two  columns  wi  and 
W2  (up  to  a  uniform  scaling  factor).  Therefore  the  column  vector  in  Equation  (C.53)  is  zero 
below  row  2.  This  means  in  turn  that  the  second  term  in  Equation  (C.52)  is  zero  below  row 
2,  and  hence  that  Uxe^r  is  integral  below  row  2  (since  the  first  term  is  integral).  □ 

To  utilize  Lemma  C.2,  we  first  re-arrange  the  Equation  (C.46)  above  so  that  it  reads: 

yDxVx^z  -  UxM+e^  =  -Uxk  (C.54) 

Let  v  =  jDxV^^z  —  UxM^e^.  Note  that  since  Dx  is  zero  below  row  2,  the  entries  of  v 
below  row  2  will  be  equal  to  those  of  (— UxM^e^,  which  are  integers  by  Lemma  C.2.  Now 
consider  the  first  and  second  entries  of  v.  Let  f  be  the  vector  containing  the  fractional  parts 
of  the  first  two  elements  of  vector  UxM^e)^,  and  let  A  be  the  invertible  matrix  consisting 
of  the  first  two  rows  of  jDxVj^^.  Without  loss  of  generality,  choose  z*  =  A^^f  so  that  the 
fractional  part  f  is  armihilated,  leaving  only  integer  elements  in  the  first  two  entries  of  v. 
Hence  we  now  have: 


v  =  -Uxk  (C.55) 

with  v  ensured  to  contain  only  integer  elements.  Since  Ux  is  unimodular,  the  vector 
k*  =  — U^^v  will  be  integral.  We  have  thus  found  a  pair  (z*,k*)  with  integer  k*  which 
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Figure  C.l:  Illustrating  Lemma  C.4.1  (a)  and  Lemma  C.4.2  (b) 


(a) 


(b) 


satisfies  the  Equation  (C.46).  Since  Equation  (C.46)  is  related  to  Equation  (C.43)  via  a 
unimodular  (and  hence  invertible)  mapping  Ux,  invariance  is  hence  proven.  □ 


C.4  Minimum  Cycle  Basis  Proofs 

Lemma  C.4.1.  (Liebchen  and  Rizzi,  2005):  Let  T-L  he  the  Horton  family  of  a  graph  in  which  there  is 
a  unique  minimum  path  between  each  pair  of  nodes.  Let  Cbe  a  cycle  of  G  which  is  not  in  Li.  Then 
there  exists  a  minimum  path  Pu,v  between  nodes  in  u  and  v  of  C  which  is  internally  disjoint  from  C. 

Proof:  Choose  an  edge  ab  in  C  connecting  nodes  a  and  b  (c.f.  Eigure  .  Since  C  is  not  a 
member  of  TL,  either  the  path  Pua  or  P^},  in  C  must  be  sub-optimal  (as  otherwise  C\ab 
would  be  a  shortest-path  tree).  □ 

Lemma  C.4.2.  Liebchen  and  Rizzi  (2005):  All  oriented  cycles  in  a  minimum  cycle  basis  of  a  directed 
graph  D  are  in  the  Horton  family. 

Proof:  Suppose  we  have  found  such  a  minimum  cycle  basis  B  =  Ci, ...,  Ct,  Cpi  and  it  was 
obtained  by  applying  the  greedy  algorithm  to  the  complete  set  of  cycles  of  D.  Furthermore 
without  loss  of  generality  assume  that  Ct  is  the  first  cycle  which  is  not  a  member  of  the 
Horton  set. 
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We  know  there  exist  two  nodes  u  and  v  in  Cf  such  that  the  shortest  path  Pu,v  between 
them  is  internally  disjoint  from  Ct-  Let  Ci  and  C2  denote  the  two  cycles  in  Ct  U  Pu,v  distinct 
from  Ct  and  with  opposite  orientations  on  Pu,v  Clearly  zv{Ci)  <  zv{Ct)  and  zv{C2)  < 
and  therefore  both  Ci  and  C2  can  be  expressed  as  linear  combinations  of  the  cycles  in 
{Cl,..., Cf-i}.  Note  that  Cf  =  Ci  +  C2,  which  implies  then  that  Ct  can  be  expressed  as 
linear  combinations  of  the  cycles  in  {Ci, ...,  Ct_i}.  But  this  is  impossible  since  the  greedy 
algorithm  chose  Cf  as  linearly-independent  of  these  cycles,  and  hence  we  have  the  necessary 
contradiction.  □ 
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